Preface


These proceedings contain the papers selected for presentation at the 14th International 
Symposium on Frontiers of Combining Systems (FroCoS 2023). The symposium was 
held during September 20-22, 2023 at Czech Technical University in Prague (CTU), 
Czech Republic. It was co-located with the 32nd International Conference on Automated 
Reasoning with Analytic Tableaux and Related Methods (TABLEAUX 2023). 

FroCoS is the main international event for research on the development of techniques 
and methods for the combination and integration of formal systems, their modularization 
and analysis. Previous FroCoS meetings have been organized across the world since
1996; see Figures 1 and 2 for a global and a European view of the locations of past and
present meetings.


Fig. 1. A global map showing locations of past and current FroCoS meetings 


FroCoS 2023 received 22 high-quality paper submissions, which were evaluated by 
the members of the Program Committee who did a great job at thoroughly evaluating 
these submissions regarding their technical and presentational quality and providing 
helpful feedback to the authors. Reviewing was single-blind and each paper was subject 
to at least three reviews, followed by sometimes extensive discussions within the Program 
Committee and, in three cases, a second round of reviewing. In the end, 14 papers were 
selected for presentation at the symposium and for publication. We have grouped them in 
this volume according to the following topic classification: (1) analysis of programs and 
equations, (2) unification, (3) decidable fragments, (4) frameworks, and (5) higher-order 
theorem proving. 

Fig. 2. A Europe-centric map showing locations of past and current FroCoS meetings in Europe
and Asia


Together with the Program Committee, we considered suitable candidates to give an
invited talk, and were delighted to have found five outstanding invited speakers:


— Marta Bílková, Czech Academy of Sciences, Czechia (joint with TABLEAUX 2023) 

— Chad E. Brown, Czech Technical University in Prague, Czechia (joint with 
TABLEAUX 2023) 

— Valentin Goranko, Stockholm University, Sweden (joint with TABLEAUX 2023) 

— Katalin Fazekas, TU Wien, Austria 

— Yoni Zohar, Bar-Ilan University, Israel 


We would like to thank all the people who contributed to making FroCoS 2023 a 
success. In particular, we thank the members of the Program Committee and the external 
reviewers for their excellent, timely work and for providing the authors with insightful 
feedback. Of course we thank the authors for submitting high-quality papers, taking the 
reviewers’ feedback into account, and presenting their work in a way that is accessible to 
the broad FroCoS audience. Next, we thank the invited speakers for their inspiring talks. 
Moreover, we thank the local organisers and the Czech Technical University in Prague 
for organising and supporting FroCoS. Finally, we gratefully acknowledge financial 
support from Springer. 
Incremental Reasoning in Embedded SAT Solvers


Katalin Fazekas
TU Wien, Austria 


Abstract. Embedding SAT solvers as sub-reasoning engines into more 
complex tools is a common practice in various application domains. For 
instance, SAT-based model checkers exploit modern solvers as black-box 
oracles, while solvers for Satisfiability Modulo Theories (SMT), Maxi- 
mum Satisfiability (MaxSAT) or other combinatorial problems combine 
SAT solvers with various reasoning or optimization engines. Such embed- 
ded SAT solvers are used incrementally in most cases, i.e., the exact same 
SAT solver instance is reused to solve multiple related SAT queries. The 
goal of incremental reasoning is to exploit the shared constraints between 
consecutive SAT queries and thereby avoid repeated work and reduce 
solving time. 

In this talk, first we briefly survey the functionalities supported by 
IPASIR, the standard API of incremental SAT solvers, which integrates 
solvers as black-boxes into larger systems. Then, we present our recently 
proposed extension to that interface which allows us to modify and refine 
SAT queries already during solving and thereby to benefit from incremen- 
tal reasoning even more. The proposed extension, as we demonstrate by 
our experiments, captures the most essential functionalities that are suffi- 
cient to simplify and improve use cases where a more fine-grained interac- 
tion between the SAT solver and the rest of the system is required. We will 
present our experiments where we extended CaDiCaL, a state-of-the-art 
incremental SAT solver, with our proposed interface and evaluated it on 
two representative use cases: enumerating graphs within the SAT modulo 
Symmetries framework (SMS), and embedding it as the main CDCL(T) 
SAT engine in the SMT solver cvc5. Following that, we overview the key 
open challenges in such use cases to efficiently combine some complex 
crucial features of modern SAT solvers, such as inprocessing and proof 
production, with incremental reasoning. At the end, we briefly present 
possible ways to address some of these challenges. 

This is joint work with Aina Niemetz, Mathias Preiner, Markus
Kirchweger, Stefan Szeider, and Armin Biere. 


On Datatypes, Synergies, and Unicorns: Recent 
Developments in Theory Combination 


Yoni Zohar
Bar-Ilan University, Israel 


Abstract. A Satisfiability Modulo Theories (SMT) solver is a tool that 
takes as input a first-order formula, and determines its T-satisfiability, 
that is, the existence of a first-order structure that satisfies it, as well as 
the axioms of some first-order theory T. Some theories are considered 
primitive, such as the theories of integers, reals, arrays, and lists. Other 
theories are considered combined, as they are obtained by the combina- 
tion of existing theories. Examples include the theory of arrays of integers, 
or of lists of reals. 

Now, assume that you have an SMT solver that supports two theories. 
How hard would it be to extend it so that it supports their combination? 
The classical answer to this question was given by Nelson and Oppen. They
designed a decision procedure for a given combined theory by first purifying
the input formula into two parts, one for each theory; then guessing
equalities and disequalities between the shared variables of the two parts; 
and finally calling the two decision procedures for the separate theories on 
the part of the purified formula that is relevant to them, plus the guessed 
set of (dis)equalities. 
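
The guessing step described above can be made concrete with a small sketch. The Python code below enumerates every way to partition the shared variables into classes of equal variables and turns each partition into a set of equalities and disequalities. The function names `partitions` and `arrangement_literals` are illustrative assumptions, not part of any SMT solver's API, and the calls to the two theory-specific decision procedures are left out.

```python
from itertools import combinations

def partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller

def arrangement_literals(partition):
    """Return (equalities, disequalities) induced by one partition."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    equalities, disequalities = [], []
    for a, b in combinations(sorted(block_of), 2):
        if block_of[a] == block_of[b]:
            equalities.append((a, b))
        else:
            disequalities.append((a, b))
    return equalities, disequalities

shared = ["x", "y", "z"]  # hypothetical shared variables
guesses = [arrangement_literals(p) for p in partitions(shared)]
```

The number of guesses is the number of set partitions of the shared variables (a Bell number), which grows quickly and is why reducing the number of guesses matters.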

The correctness of this combination method requires the two com- 
bined theories to be stably infinite, a model theoretic property related to 
the existence of infinite models. However, not all theories of interest are 
stably infinite. (For example, the theory of fixed-size bit-vectors is not.) 

This state of affairs led to the development of various other com- 
bination methods that rely on various model theoretic notions, such as 
shiny, gentle, and polite theories. For each combination method, the cor- 
responding properties of the theories need to be proven in order to be 
used with that method. And indeed, various theories have been shown to 
admit such properties. 

In this talk I will survey recent results in the field of theory combi- 
nation. First, I will sketch a proof that theories of datatypes (e.g., lists, 
trees) can be combined with any other theory, using the polite combina- 
tion method. Next, I will show how the original Nelson-Oppen method 
can be integrated with the polite combination method in a synergetic
way that reduces the number of guesses one needs to make. Finally, a
taxonomy of various model theoretic properties from theory combination 
will be presented, where the properties will be analyzed and compared. 
This will include the description of open problems which relate to a 
certain kind of theories (that are called “unicorns”).
Targeting Completeness: Using Closed
Forms for Size Bounds of Integer
Programs


Nils Lommen and Jürgen Giesl


LuFG Informatik 2, RWTH Aachen University, Aachen, Germany 
lommen@cs.rwth-aachen.de, giesl@informatik.rwth-aachen.de 


Abstract. We present a new procedure to infer size bounds for integer 
programs automatically. Size bounds are important for the deduction of 
bounds on the runtime complexity or in general, for the resource analy- 
sis of programs. We show that our technique is complete (i.e., it always 
computes finite size bounds) for a subclass of loops, possibly with non- 
linear arithmetic. Moreover, we present a novel approach to combine 
and integrate this complete technique into an incomplete approach to 
infer size and runtime bounds of general integer programs. We prove 
completeness of our integration for an important subclass of integer pro- 
grams. We implemented our new algorithm in the automated complexity 
analysis tool KoAT to evaluate its power, in particular on programs with 
non-linear arithmetic. 


1 Introduction 


There are numerous incomplete approaches for automatic resource analysis 
of programs, e.g., [1,2,5,8,10,15,19,21,29,33]. However, many complete
techniques to decide termination, analyze runtime complexity, or study memory
consumption for certain classes of programs have also been developed, e.g.,
[3,4,6,7,16,17,20,22,27,34,36]. In this paper, we present a procedure to compute
size bounds which indicate how large the absolute value of an integer variable
may become. In contrast to other complete procedures for the inference of
size bounds which are based on fixpoint computations [3,6], our technique can 
also handle (possibly negative) constants and exponential size bounds. Similar to 
our earlier paper [27], we embed a procedure which is complete for a subclass of 
loops (i.e., it computes finite size bounds for all loops from this subclass) into an 
incomplete approach for general integer programs [8,19]. In this way, the power 
of the incomplete approach is increased significantly, in particular for programs 
with non-linear arithmetic. However, in the current paper we tackle a completely 
different problem than in [27] (and thus, the actual new contributions are also 
completely different), because in [27] we embedded a complete technique in order
to infer runtime bounds, whereas now we integrate a novel technique in order to
infer size bounds. As an example, we want to determine bounds on the absolute
values of the variables during (and after) the execution of the following loop.


$$\textbf{while } (x_3 > 0) \textbf{ do } (x_1, x_2, x_3, x_4) \leftarrow (3 \cdot x_1 + 2 \cdot x_2,\; -5 \cdot x_1 - 3 \cdot x_2,\; x_3 - 1,\; x_4 + x_3^2) \qquad (1)$$


Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)
- 235950644 (Project GI 274/6-2).
© The Author(s) 2023


We introduce a technique to compute size bounds for loops which admit a 
closed form, i.e., an expression which corresponds to applying the loop’s update 
n times. Then we over-approximate the closed form to obtain a non-negative, 
weakly monotonically increasing function. For instance, a closed form for $x_3$ in
our example is $x_3 - n$, since the value of $x_3$ is decreased by $n$ after $n$ iterations.
The (absolute value of this) closed form can be over-approximated by $x_3 + n$,
which is monotonically increasing in all variables. Finally, each occurrence of
$n$ is substituted by a runtime bound for the loop. Clearly, (1) terminates after
at most $x_3$ iterations. So if we substitute $n$ by the runtime bound $x_3$ in the
over-approximated closed form $x_3 + n$, then we infer the linear bound $2 \cdot x_3$ on
the size of $x_3$. Due to the restriction to weakly monotonically increasing
over-approximations, we can plug in any over-approximation of the runtime and do
not necessarily need exact bounds.
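
The reasoning for $x_3$ can be checked empirically. The following Python sketch simulates loop (1) (with the update read as $x_1 \leftarrow 3x_1 + 2x_2$, $x_2 \leftarrow -5x_1 - 3x_2$, $x_3 \leftarrow x_3 - 1$, $x_4 \leftarrow x_4 + x_3^2$) and tests that $|x_3|$ never exceeds $2 \cdot |x_3|$ of the initial state. The helper names are illustrative assumptions, and the check is a test on sample inputs, not a proof.

```python
def run_loop(x1, x2, x3, x4):
    """Yield every state visited by loop (1), including the initial one."""
    yield (x1, x2, x3, x4)
    while x3 > 0:
        # Simultaneous update: the right-hand side uses the old values.
        x1, x2, x3, x4 = 3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2
        yield (x1, x2, x3, x4)

def size_bound_x3(x3_initial):
    """Over-approximated closed form |x3| + n with n replaced by the runtime bound |x3|."""
    return 2 * abs(x3_initial)

def bound_holds(x3_initial):
    """Check the bound on x3 for one run started with x1 = x2 = 1 and x4 = 0."""
    return all(abs(state[2]) <= size_bound_x3(x3_initial)
               for state in run_loop(1, 1, x3_initial, 0))

checked = all(bound_holds(x3) for x3 in range(-5, 20))
```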


Structure. We introduce our technique to compute size bounds by closed forms
in Sect. 2 and show that it is complete for a subclass of loops in Sect. 3. Afterwards,
in Sect. 4, we incorporate our novel technique into the incomplete setting
of general integer programs. In Sect. 5 we demonstrate how size bounds are used 
in automatic complexity analysis and study completeness for classes of general 
programs. In Sect. 6, we conclude with an experimental evaluation of our imple- 
mentation in the tool KoAT and discuss related work. All proofs can be found 
in [28]. 


2 Size Bounds by Closed Forms 


In this section, we present our novel technique to compute size bounds for loops
by closed forms in Theorem 7. We start by introducing the required preliminaries.
Let $\mathcal{V} = \{x_1, \ldots, x_d\}$ be a set of variables. $\mathcal{F}(\mathcal{V})$ is the set of all formulas
built from inequations $p > 0$ for polynomials $p \in \mathbb{Q}[\mathcal{V}]$, $\land$, and $\lor$. A loop $(\varphi, \eta)$
consists of a guard $\varphi \in \mathcal{F}(\mathcal{V})$ and an update $\eta : \mathcal{V} \to \mathbb{Z}[\mathcal{V}]$ mapping variables
to polynomials. A closed form $\mathrm{cl}^{x_i}$ (formally defined in Definition 1 below) is
an expression in $n$ and in the (initial values of the) variables $x_1, \ldots, x_d$ which
corresponds to the value of $x_i$ after iterating the loop $n$ times. For our purpose
we only need closed forms which hold for all $n \geq n_0$ for some fixed $n_0 \in \mathbb{N}$. Moreover,
we restrict ourselves to closed forms which are so-called normalized
poly-exponential expressions [16]. Nonetheless, our procedure works for any closed
form expression with a finite number of arithmetic operations (i.e., the number
of operations must be independent of $n$). We extend the application of functions
like $\eta : \mathcal{V} \to \mathbb{Z}[\mathcal{V}]$ also to polynomials, vectors, and formulas, etc., by replacing
each variable $v$ in the expression by $\eta(v)$. So in particular, $(\eta_2 \circ \eta_1)(x) = \eta_2(\eta_1(x))$
stands for the polynomial $\eta_1(x)$ in which every variable $v$ is replaced by $\eta_2(v)$.
Moreover, $\eta^n$ denotes the $n$-fold application of $\eta$.

We call a function $\sigma : \mathcal{V} \to \mathbb{Z}$ a state. By $\sigma(exp)$ or $\sigma(\varphi)$ we denote the
number resp. Boolean value which results from replacing every variable $v$ by the
number $\sigma(v)$ in the arithmetic expression $exp$ or the formula $\varphi$.


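Definition 1 (Closed Forms). For a loop $(\varphi, \eta)$, an arithmetic expression
$\mathrm{cl}^{x_i}$ is a closed form for $x_i$ with start value $n_0 \in \mathbb{N}$ if $\mathrm{cl}^{x_i} = \sum_{1 \leq j \leq \ell} \alpha_j \cdot n^{a_j} \cdot b_j^n$
with $\ell, a_j \in \mathbb{N}$, $b_j \in \mathbb{A}$,¹ $\alpha_j \in \mathbb{A}[\mathcal{V}]$, and for all $\sigma : \mathcal{V} \cup \{n\} \to \mathbb{Z}$ with $\sigma(n) \geq n_0$
we have $\sigma(\mathrm{cl}^{x_i}) = \sigma(\eta^n(x_i))$. Similarly, we call $\mathrm{cl} = (\mathrm{cl}^{x_1}, \ldots, \mathrm{cl}^{x_d})$ a closed
form of the update $\eta$ (resp. for the loop $(\varphi, \eta)$) with start value $n_0$ if for all
$1 \leq i \leq d$, $\mathrm{cl}^{x_i}$ are closed forms for $x_i$ with start value $n_0$.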


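Example 2. In Sect. 3 we will show that for the loop (1), a closed form for $x_1$
(with start value 0) is $\mathrm{cl}^{x_1} = \tfrac{1}{2} \cdot \alpha \cdot (-i)^n + \tfrac{1}{2} \cdot \overline{\alpha} \cdot i^n$ where $\alpha = (1 + 3i) \cdot x_1 + 2i \cdot x_2$.
Here, $\overline{\alpha}$ denotes the complex conjugate of $\alpha$, i.e., the sign of those monomials is
flipped where the coefficient is a multiple of the imaginary unit $i$. A closed form
for $x_4$ (also with start value 0) is
$\mathrm{cl}^{x_4} = x_4 + n \cdot \left(\tfrac{1}{6} + x_3 + x_3^2 - x_3 \cdot n - \tfrac{n}{2} + \tfrac{n^2}{3}\right)$.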
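
As a sanity check, the two closed forms can be compared against direct iteration of the update of loop (1). The Python sketch below does this with exact arithmetic: powers of the imaginary unit are looked up in a table of period 4, and the polynomial for $x_4$ is evaluated with `Fraction`. The function names are illustrative assumptions, and the comparison covers only the sampled inputs.

```python
from fractions import Fraction

I_POWERS = [1, 1j, -1, -1j]  # i**n cycles with period 4

def iterate(x1, x2, x3, x4, n):
    """Apply the update of loop (1) n times, ignoring the guard."""
    for _ in range(n):
        x1, x2, x3, x4 = 3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2
    return x1, x2, x3, x4

def closed_form_x1(x1, x2, n):
    """1/2 * alpha * (-i)^n + 1/2 * conj(alpha) * i^n with alpha = (1+3i)*x1 + 2i*x2."""
    alpha = (1 + 3j) * x1 + 2j * x2
    minus_i_n = I_POWERS[(-n) % 4]  # (-i)^n equals i^(-n)
    i_n = I_POWERS[n % 4]
    return 0.5 * alpha * minus_i_n + 0.5 * alpha.conjugate() * i_n

def closed_form_x4(x3, x4, n):
    """x4 + n * (1/6 + x3 + x3^2 - x3*n - n/2 + n^2/3), evaluated exactly."""
    n = Fraction(n)
    return x4 + n * (Fraction(1, 6) + x3 + x3 ** 2 - x3 * n - n / 2 + n ** 2 / 3)
```

For small integer inputs all intermediate floating-point values are integers or half-integers, so the complex result can be compared with the iterated integer value directly.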


Our aim is to compute bounds on the sizes of variables and on the runtime. 
As in [8,19], we only consider bounds which are weakly monotonically increasing 
in all occurring variables. Their advantage is that we can compose them easily 
(i.e., if $f$ and $g$ increase monotonically, then so does $f \circ g$).


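Definition 3 (Bounds). The set of bounds $\mathcal{B}$ is the smallest set with $\overline{\mathbb{N}} = \mathbb{N} \cup \{\omega\} \subseteq \mathcal{B}$,
$\mathcal{V} \subseteq \mathcal{B}$, and $\{b_1 + b_2,\; b_1 \cdot b_2,\; k^{b_1}\} \subseteq \mathcal{B}$ for all $k \in \mathbb{N}$ and $b_1, b_2 \in \mathcal{B}$.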


Size bounds should be bounds on the values of variables up to the point 
where the loop guard is not satisfied anymore for the first time. To define size 
bounds, we introduce the runtime complexity of a loop (whereas we considered 
the runtime complexity of arbitrary integer programs in [8,19,27]). Let $\Sigma$ denote
the set of all states $\sigma : \mathcal{V} \to \mathbb{Z}$ and let $|\sigma|$ be the state with $|\sigma|(x) = |\sigma(x)|$ for
all $x \in \mathcal{V}$.


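Definition 4 (Runtime Complexity for Loops). The runtime complexity
of a loop $(\varphi, \eta)$ is $\mathrm{rc} : \Sigma \to \overline{\mathbb{N}}$ with $\mathrm{rc}(\sigma) = \inf\{n \in \mathbb{N} \mid \sigma(\eta^n(\neg\varphi))\}$, where
$\inf \emptyset = \omega$. An expression $r \in \mathcal{B}$ is a runtime bound if $|\sigma|(r) \geq \mathrm{rc}(\sigma)$ for all
$\sigma \in \Sigma$.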


Example 5. The runtime complexity of the loop (1) is $\mathrm{rc}(\sigma) = \max(0, \sigma(x_3))$.
For example, $x_3$ is a runtime bound, as $|\sigma|(x_3) \geq \max(0, \sigma(x_3))$ for all states
$\sigma \in \Sigma$.
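
Definition 4 can be read operationally: the runtime complexity is the first $n$ for which the guard fails after $n$ updates. The Python sketch below computes it by iteration for loop (1). The names `runtime_complexity`, `guard`, and `update` and the cutoff `limit` are assumptions made for illustration; returning `None` beyond the cutoff only stands in for $\omega$ and does not decide non-termination.

```python
def runtime_complexity(state, guard, update, limit=10_000):
    """Return the least n such that the guard fails after n updates.

    Returns None if no such n <= limit exists (a stand-in for omega).
    """
    for n in range(limit + 1):
        if not guard(state):
            return n
        state = update(state)
    return None

def guard(s):
    """Guard of loop (1): x3 > 0."""
    return s[2] > 0

def update(s):
    """Update of loop (1), applied simultaneously to (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)
```

On the sampled states the result agrees with $\max(0, \sigma(x_3))$, and $|\sigma|(x_3)$ is at least as large.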


A size bound on a variable x is a bound on the absolute value of x after n 
iterations of the update $\eta$, where $n$ is bounded by the runtime complexity. In
contrast to the definition of size bounds for transitions in integer programs from 
[8], Definition 6 requires that size bounds also hold before evaluating the loop. 


1 $\mathbb{A}$ is the set of algebraic numbers, i.e., the field of all roots of polynomials in $\mathbb{Z}[x]$.

Definition 6 (Size Bounds for Loops). $\mathcal{SB} : \mathcal{V} \to \mathcal{B}$ is a size bound for
$(\varphi, \eta)$ if for all $x \in \mathcal{V}$ and all $\sigma \in \Sigma$, we have
$|\sigma|(\mathcal{SB}(x)) \geq \sup\{|\sigma(\eta^n(x))| \mid n \leq \mathrm{rc}(\sigma)\}$.


For any algebraic number $c \in \mathbb{A}$, as usual $\lceil|c|\rceil$ is the smallest natural number
which is greater or equal to $c$'s absolute value. Similarly, for any poly-exponential
expression $p = \sum_j \left(\sum_i c_{i,j} \cdot \beta_{i,j}\right) \cdot n^{a_j} \cdot b_j^n$ where $c_{i,j} \in \mathbb{A}$ and the $\beta_{i,j}$ are normalized
monomials of the form $x_1^{e_1} \cdot \ldots \cdot x_d^{e_d}$, $\lceil|p|\rceil$ denotes $\sum_j \left(\sum_i \lceil|c_{i,j}|\rceil \cdot \beta_{i,j}\right) \cdot n^{a_j} \cdot \lceil|b_j|\rceil^n$.

We now determine size bounds by over-approximating the closed form $\mathrm{cl}^x$
by the non-negative, weakly monotonically increasing function $\lceil|\mathrm{cl}^x|\rceil$. Then we
substitute $n$ by a runtime bound $r$ (denoted by “$[n/r]$”). Due to the monotonicity,
this results in a bound on the size of $x$ not only at the end of the loop, but
also during the iterations of the loop. Since the closed form is only valid for $n$
iterations with $n \geq n_0$, we ensure that our size bound is also correct for fewer than
$n_0$ iterations by symbolically evaluating the update, where we over-approximate
maxima by sums. As mentioned, see [28] for the proofs of all new results.


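Theorem 7 (Size Bounds for Loops with Closed Forms). Let $\mathrm{cl}$ be a closed
form for the loop $(\varphi, \eta)$ with start value $n_0$ and let $r \in \mathcal{B}$ be a runtime bound.
Then the (absolute) size of $x \in \mathcal{V}$ is bounded by $\mathrm{sb}^x = \lceil|\mathrm{cl}^x|\rceil[n/r] + \sum_{0 \leq i < n_0} \lceil|\eta^i(x)|\rceil$.
Hence, the function $\mathcal{SB}$ with $\mathcal{SB}(x) = \mathrm{sb}^x$ for all $x \in \mathcal{V}$ is a size bound for $(\varphi, \eta)$.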


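Example 8. As mentioned, for the loop (1), a closed form for $x_1$ with start value
0 is $\mathrm{cl}^{x_1} = \tfrac{1}{2} \cdot \alpha \cdot (-i)^n + \tfrac{1}{2} \cdot \overline{\alpha} \cdot i^n$ where $\alpha = (1 + 3i) \cdot x_1 + 2i \cdot x_2$. Hence,
$$\lceil|\mathrm{cl}^{x_1}|\rceil = \left(\left\lceil\left|\tfrac{1+3i}{2}\right|\right\rceil \cdot x_1 + \lceil|i|\rceil \cdot x_2\right) \cdot \lceil|{-i}|\rceil^n + \left(\left\lceil\left|\tfrac{1-3i}{2}\right|\right\rceil \cdot x_1 + \lceil|{-i}|\rceil \cdot x_2\right) \cdot \lceil|i|\rceil^n = 4 \cdot x_1 + 2 \cdot x_2,$$
as $\left\lceil\left|\tfrac{1+3i}{2}\right|\right\rceil = \left\lceil\left|\tfrac{1-3i}{2}\right|\right\rceil = \left\lceil\tfrac{\sqrt{10}}{2}\right\rceil = 2$
and $\lceil|i|\rceil = \lceil|{-i}|\rceil = 1$. So our approach infers linear size bounds for $x_1$ and $x_2$
(the similar computations for $x_2$ are omitted) while [8] only infers exponential
size bounds.

As this over-approximation does not depend on $n$, it directly yields a size
bound, i.e., $\mathrm{sb}^{x_1} = \lceil|\mathrm{cl}^{x_1}|\rceil$. In contrast, in the over-approximation
$\lceil|\mathrm{cl}^{x_4}|\rceil = x_4 + n \cdot (1 + x_3 + x_3^2 + x_3 \cdot n + n + n^2)$, we have to replace $n$ by a runtime bound
like $x_3$. Thus, we obtain the overall size bound $\mathrm{sb}^{x_4} = x_4 + 3 \cdot x_3^3 + 2 \cdot x_3^2 + x_3$.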
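
The size bounds of this example can be tested against concrete runs. The Python sketch below evaluates $4 \cdot x_1 + 2 \cdot x_2$ and $x_4 + 3 \cdot x_3^3 + 2 \cdot x_3^2 + x_3$ on the absolute values of the initial state and compares them with every state visited by loop (1). The helper names are illustrative assumptions; the sketch tests sample inputs and does not replace the proof of Theorem 7.

```python
def states_of_loop(x1, x2, x3, x4):
    """Yield every state visited by loop (1), including the initial one."""
    yield (x1, x2, x3, x4)
    while x3 > 0:
        x1, x2, x3, x4 = 3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2
        yield (x1, x2, x3, x4)

def sb_x1(x1, x2, x3, x4):
    """Size bound 4*x1 + 2*x2, evaluated on absolute values."""
    return 4 * abs(x1) + 2 * abs(x2)

def sb_x4(x1, x2, x3, x4):
    """Size bound x4 + 3*x3^3 + 2*x3^2 + x3, evaluated on absolute values."""
    a3 = abs(x3)
    return abs(x4) + 3 * a3 ** 3 + 2 * a3 ** 2 + a3

def bounds_hold(initial):
    """Check both bounds on every state of the run starting in `initial`."""
    return all(abs(s[0]) <= sb_x1(*initial) and abs(s[3]) <= sb_x4(*initial)
               for s in states_of_loop(*initial))
```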


Although this section focused on closed forms which are poly-exponential 
expressions, our technique is applicable to all loops where we can compute
over-approximating bounds for the closed form and the runtime complexity. For example,
the update $\eta(x) = x^2$ has the closed form $x^{(2^n)}$, but it does not admit a
poly-exponential closed form due to $x$'s super-exponential growth. However, by
instantiating n by a runtime bound, we can still compute a size bound for this 
update. The reason for focusing on poly-exponential expressions is that we can 
compute such a closed form for all so-called solvable loops automatically, see 
Sect. 3. 
3 Size and Runtime Bounds for Solvable Loops


In this section, we present a class of loops where our technique of Theorem 7 
is “complete”. The technique relies on the computation of suitable closed forms 
and of runtime bounds. In Sect. 3.1, we show that poly-exponential closed forms 
can be computed for all solvable loops [17,23,25,26,32,36]. Then we prove in 
Sect. 3.2 that finite runtime bounds are computable for all terminating solvable 
loops with only periodic rational eigenvalues. 

A loop $(\varphi, \eta)$ is solvable if $\eta$ is a solvable update (see Definition 9 below
for a formal definition), which partitions $\mathcal{V}$ into blocks $S_1, \ldots, S_m$ (and loop
guards $\varphi$ are not relevant for closed forms). Each block allows updates with cyclic
dependencies between its variables and non-linear dependencies on variables in 
blocks with lower indices. 


Definition 9 (Solvable Update [17,23,25,26,32,36]). An update $\eta : \mathcal{V} \to \mathbb{Z}[\mathcal{V}]$
is solvable if there exists a partition $S_1, \ldots, S_m$ of $\{x_1, \ldots, x_d\}$ such that
for all $1 \leq i \leq m$ we have $\eta_{S_i} = A_{S_i} \cdot x_{S_i} + p_{S_i}$ for an $A_{S_i} \in \mathbb{Z}^{|S_i| \times |S_i|}$ and a
$p_{S_i} \in \mathbb{Z}\left[\bigcup_{j < i} S_j\right]^{|S_i|}$, where $\eta_{S_i}$ is the vector of all $\eta(x_j)$ and $x_{S_i}$ is the vector
of all $x_j$ with $x_j \in S_i$. The eigenvalues of a solvable loop are defined as the union
of the eigenvalues of all matrices $A_{S_i}$. The loop is homogeneous if $p_{S_i} = 0$ for
all $1 \leq i \leq m$.


Example 10. The loop (1) is an example of a solvable loop using the partition
$S_1 = \{x_1, x_2\}$, $S_2 = \{x_3\}$, and $S_3 = \{x_4\}$.


The crucial idea for our results in Sect. 3.1 and 3.2 is to reduce the problem
of finding closed forms and runtime bounds from solvable loops to triangular
weakly non-linear loops (twn-loops) [16,17,20]. A twn-update is a solvable
update where each block $S_i$ has cardinality one. Thus, a twn-update is triangular,
i.e., the update of a variable does not depend on variables with higher
indices. Furthermore, the update is weakly non-linear, i.e., a variable does not
occur non-linearly in its own update. We are mainly interested in loops over $\mathbb{Z}$,
but to handle solvable updates, we will transform them into twn-updates with
coefficients from $\mathbb{A}$.


Definition 11 (TWN-Update [16,17,20]). An update $\eta : \mathcal{V} \to \mathbb{A}[\mathcal{V}]$ is twn
if for all $1 \leq i \leq d$ we have $\eta(x_i) = c_i \cdot x_i + p_i$ for some $c_i \in \mathbb{A}$ and some
polynomial $p_i \in \mathbb{A}[x_1, \ldots, x_{i-1}]$. A loop with a twn-update is called a twn-loop.


Clearly, (1) is not a twn-loop due to the cyclic dependency between $x_1$ and $x_2$.


3.1 Closed Forms for Solvable Loops 


Lemma 12 (which extends [17, Thm. 16] from solvable updates with real eigenvalues
to arbitrary solvable updates) illustrates that one can transform any solvable
update $\eta_s$ into a twn-update $\eta_t$ by an automorphism $\vartheta$. Here, $\vartheta$ is induced by
the change-of-basis matrix of the Jordan normal form of each block of $\eta_s$. Note
that the Jordan normal form is always computable in polynomial time (see [9]).

Lemma 12 (Transforming Solvable Updates (see [17, Thm. 16])). Let
$\eta_s$ be a solvable update. Then $\vartheta : \mathcal{V} \to \mathbb{A}[\mathcal{V}]$ is an automorphism, where $\vartheta$ is
defined by $\vartheta(x_S) = P \cdot x_S$ for each block $S$, where $J(A_S) = P \cdot A_S \cdot P^{-1}$ is the
Jordan normal form of $A_S$. Furthermore, $\eta_t = \vartheta^{-1} \circ \eta_s \circ \vartheta$ is a twn-update.


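Example 13. To illustrate Lemma 12, we transform the solvable update $\eta_s$ of (1)
into a twn-update $\eta_t$. As the blocks $S_2 = \{x_3\}$ and $S_3 = \{x_4\}$ have cardinality
one, we only have to consider $S_1 = \{x_1, x_2\}$. The restriction of $\eta_s$ to $S_1$ is
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \leftarrow A_{S_1} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ with $A_{S_1} = \begin{pmatrix} 3 & 2 \\ -5 & -3 \end{pmatrix}$. So we get the Jordan normal form

$$J(A_{S_1}) = P \cdot A_{S_1} \cdot P^{-1} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \text{ where } P = \begin{pmatrix} -\tfrac{4}{5} + \tfrac{3}{5}i & -\tfrac{3}{5} + \tfrac{1}{5}i \\ -\tfrac{4}{5} - \tfrac{3}{5}i & -\tfrac{3}{5} - \tfrac{1}{5}i \end{pmatrix} \text{ and } P^{-1} = \begin{pmatrix} \tfrac{1}{2}(1 - 3i) & \tfrac{1}{2}(1 + 3i) \\ -\tfrac{3}{2} + 2i & -\tfrac{3}{2} - 2i \end{pmatrix}.$$

Thus, we have the following automorphism $\vartheta$ and its inverse $\vartheta^{-1}$:

$$\vartheta\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = P \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad \vartheta^{-1}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = P^{-1} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad \vartheta(x_j) = \vartheta^{-1}(x_j) = x_j \text{ for } j \in \{3, 4\}.$$

Hence, $\eta_t = \vartheta^{-1} \circ \eta_s \circ \vartheta$ is the following twn-update:

$$\eta_t(x_1) = -i \cdot x_1, \quad \eta_t(x_2) = i \cdot x_2, \quad \eta_t(x_3) = x_3 - 1, \quad \eta_t(x_4) = x_4 + x_3^2$$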
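
The diagonalization in this example can be checked numerically. The Python sketch below multiplies 2x2 complex matrices by hand and tests that $P \cdot A_{S_1} \cdot P^{-1}$ is the diagonal matrix with entries $-i$ and $i$. The concrete entries of `P` and `P_INV` are one possible choice and should be read as an assumption: eigenvectors are only determined up to scaling, so other change-of-basis matrices work equally well.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(m, expected, tol=1e-12):
    """Entry-wise comparison up to a small floating-point tolerance."""
    return all(abs(m[i][j] - expected[i][j]) < tol
               for i in range(2) for j in range(2))

A = [[3, 2], [-5, -3]]  # block of the update of loop (1) on x1, x2

# Assumed change-of-basis matrices (eigenvector scaling is a free choice).
P = [[(-4 + 3j) / 5, (-3 + 1j) / 5],
     [(-4 - 3j) / 5, (-3 - 1j) / 5]]
P_INV = [[(1 - 3j) / 2, (1 + 3j) / 2],
         [(-3 + 4j) / 2, (-3 - 4j) / 2]]

J = matmul(matmul(P, A), P_INV)   # expected: diag(-i, i)
IDENTITY = matmul(P, P_INV)       # expected: identity matrix
```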


The reason for transforming solvable updates to twn-updates is that for 
the latter, we can re-use our previous algorithm from [16] to compute poly- 
exponential closed forms. While [16] only considered updates with linear arith- 
metic over Z, it can directly be extended to twn-updates over A. 


Lemma 14 (Closed Forms for TWN-Updates (see [16])). Let $\eta$ be a
twn-update. Then a (poly-exponential) closed form is computable for $\eta$.


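Example 15. For $\eta_t$ from Example 13, we obtain the following closed form (with
start value 0): $\mathrm{cl}_t = \left((-i)^n \cdot x_1,\; i^n \cdot x_2,\; x_3 - n,\; x_4 + n \cdot \left(\tfrac{1}{6} + x_3 + x_3^2 - x_3 \cdot n - \tfrac{n}{2} + \tfrac{n^2}{3}\right)\right)$.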


So to obtain a closed form of a solvable update $\eta_s$, we first transform it into
a twn-update $\eta_t$ via Lemma 12, and then compute the closed form $\mathrm{cl}_t$ of $\eta_t$
(Lemma 14). We now show how to obtain a closed form for $\eta_s$ from $\mathrm{cl}_t$.


Theorem 16 (Closed Forms for Solvable Updates). Let $\eta_s$ be a solvable
update and $\vartheta$ be an automorphism as in Lemma 12 such that $\eta_t = \vartheta^{-1} \circ \eta_s \circ \vartheta$
is a twn-update. If $\mathrm{cl}_t$ is a closed form of $\eta_t$ with start value $n_0$, then
$\mathrm{cl}_s = \vartheta \circ \mathrm{cl}_t \circ \vartheta^{-1}$ is a closed form of $\eta_s$ with start value $n_0$.


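Example 17. In Example 13 we transformed $\eta_s$ into the twn-update $\eta_t$ via an
automorphism $\vartheta$ and in Example 15, we gave a closed form $\mathrm{cl}_t$ of $\eta_t$. Thus, by
Theorem 16, we can infer a closed form $\mathrm{cl}_s = \vartheta \circ \mathrm{cl}_t \circ \vartheta^{-1}$ of $\eta_s$. For example,
we compute a closed form for $x_1$ with start value 0 ($\mathrm{cl}_s^{x_2}$ can be inferred in a
similar way):
$$\begin{aligned}
\mathrm{cl}_s^{x_1} &= \left(\tfrac{1}{2}(1 - 3i) \cdot x_1 + \tfrac{1}{2}(1 + 3i) \cdot x_2\right)[v/\mathrm{cl}_t^{v} \mid v \in \mathcal{V}]\,[v/\vartheta(v) \mid v \in \mathcal{V}] \\
&= \left(\tfrac{1}{2}(1 - 3i) \cdot (-i)^n \cdot x_1 + \tfrac{1}{2}(1 + 3i) \cdot i^n \cdot x_2\right)[v/\vartheta(v) \mid v \in \mathcal{V}] \\
&= \tfrac{1}{2}\underbrace{((1 + 3i) \cdot x_1 + 2i \cdot x_2)}_{\alpha} \cdot (-i)^n + \tfrac{1}{2}\underbrace{((1 - 3i) \cdot x_1 - 2i \cdot x_2)}_{\overline{\alpha}} \cdot i^n.
\end{aligned}$$

3.2 Periodic Rational Solvable Loops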


In Sect. 3.1, we discussed how to compute closed forms for solvable updates (by 
transforming them to twn-updates). However, to compute size bounds, we have
to instantiate the variable n in the closed forms by runtime bounds (Theorem 7). 
In [20], it was shown that (polynomial) runtime bounds can always be computed 
for terminating twn-loops over the integers. However, in general, transforming 
solvable loops via Lemma 12 yields twn-updates which may contain algebraic 
(complex) numbers. We now show that for the subclass of terminating periodic 
rational solvable loops, our approach is “complete” (i.e., finite runtime bounds 
and thus, also finite size bounds are always computable). 


Definition 18 (Periodic Rational [25]). A number $\lambda \in \mathbb{A}$ is periodic rational
if $\lambda^p \in \mathbb{Q}$ for some $p \in \mathbb{N}$ with $p > 0$. The period of $\lambda$ is the smallest such $p$ with
$\lambda^p \in \mathbb{Q}$. A solvable loop is periodic rational (i.e., it is a prs loop) with period $p$
if all its eigenvalues $\lambda$ are periodic rational and $p$ is the least common multiple of
all their periods. A prs loop is a unit prs loop if $|\lambda| \leq 1$ for all its eigenvalues $\lambda$.


So $i$, $-i$, and $\sqrt{2} \cdot i$ are periodic rational with period 2, while $\sqrt{2} + i$ is not periodic
rational. The following lemma from [25] gives a bound on the period of prs loops 
and thus yields an algorithm to detect prs loops and to compute their period. 


Lemma 19 (Bound on the Period [25]). Let $A \in \mathbb{Z}^{n \times n}$. If $\lambda$ is a periodic
rational eigenvalue of $A$ with period $p$, then $p \leq n^3$.
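
For numbers of the form $a + b \cdot i$ with rational $a$ and $b$, periodic rationality can be tested by computing powers exactly and checking whether the imaginary part vanishes. The Python sketch below does this with `Fraction`. It is a simplification made for illustration: it does not cover general algebraic numbers such as $\sqrt{2} \cdot i$, and the search cutoff `max_p` is a caller-supplied assumption rather than the bound of Lemma 19.

```python
from fractions import Fraction

def period(re, im, max_p):
    """Smallest p in 1..max_p such that (re + im*i)**p is rational, else None.

    Only numbers with rational real and imaginary part are supported.
    """
    re, im = Fraction(re), Fraction(im)
    cur_re, cur_im = Fraction(1), Fraction(0)  # (re + im*i)**0
    for p in range(1, max_p + 1):
        # Multiply the current power by (re + im*i) exactly.
        cur_re, cur_im = cur_re * re - cur_im * im, cur_re * im + cur_im * re
        if cur_im == 0:
            return p
    return None
```

For instance, `period(0, 1, 8)` and `period(0, -1, 8)` both give 2, matching the eigenvalues $\pm i$ of loop (1).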


Now we show that by chaining (i.e., by performing p iterations of a prs loop 
with period p in a single step), one can transform any prs loop into a solvable loop 
with only integer eigenvalues. Then, our previous results on twn-loops [17,20] 
can be used to infer runtime bounds for these loops. 


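Definition 20 (Chaining Loops). Let $L = (\varphi, \eta)$ be a loop and $p \in \mathbb{N} \setminus \{0\}$.
Then $L_p = (\varphi_p, \eta_p)$ results from iterating $L$ $p$ times, i.e., $\varphi_p = \varphi \land \eta(\varphi) \land
\eta(\eta(\varphi)) \land \ldots \land \eta^{p-1}(\varphi)$ and $\eta_p(v) = \eta^p(v)$ for all $v \in \mathcal{V}$.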


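Example 21. The eigenvalues $\pm i$ of (1) have period 2. Chaining yields
$(\varphi \land \eta(\varphi), \eta^2)$:


$$\textbf{while } (x_3 > 0 \land x_3 > 1) \textbf{ do } (x_1, x_2, x_3, x_4) \leftarrow (-x_1,\; -x_2,\; x_3 - 2,\; x_4 + (x_3 - 1)^2 + x_3^2) \qquad (2)$$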
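
Chaining can be written as a small generic helper and compared against the explicit loop (2). In the Python sketch below, `chain` builds the guard $\varphi \land \eta(\varphi) \land \ldots \land \eta^{p-1}(\varphi)$ and the update $\eta^p$ from Python functions; `guard_2` and `eta_2` transcribe loop (2) by hand. All names are illustrative assumptions, and the comparison only covers the sampled states.

```python
def guard(s):
    """Guard of loop (1): x3 > 0."""
    return s[2] > 0

def eta(s):
    """Update of loop (1)."""
    x1, x2, x3, x4 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)

def chain(guard_fn, update_fn, p):
    """Return guard and update of the loop obtained by chaining p iterations."""
    def chained_guard(s):
        # The original guard must hold before each of the p iterations.
        for _ in range(p):
            if not guard_fn(s):
                return False
            s = update_fn(s)
        return True

    def chained_update(s):
        for _ in range(p):
            s = update_fn(s)
        return s

    return chained_guard, chained_update

def guard_2(s):
    """Guard of loop (2)."""
    return s[2] > 0 and s[2] > 1

def eta_2(s):
    """Update of loop (2)."""
    x1, x2, x3, x4 = s
    return (-x1, -x2, x3 - 2, x4 + (x3 - 1) ** 2 + x3 ** 2)

chained_guard, chained_update = chain(guard, eta, 2)
```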


Due to Lemma 12 we can transform every solvable update into a twn-update
by a (linear) automorphism $\vartheta$. For prs loops, $\vartheta$'s range can be restricted to $\mathbb{Q}[\mathcal{V}]$,
i.e., one does not need algebraic numbers. So, we first chain the prs loop $L$ and
then compute a $\mathbb{Q}$-automorphism $\vartheta$ transforming the chained loop $L_p$ into a
twn-loop $L_t$ via Lemma 12. Then we can infer a runtime bound for $L_t$ as in [20].
The reason is that all factors $c_i$ in the update of $L_t$ are integers and thus, we can
compute a closed form $\sum_j \alpha_j \cdot n^{a_j} \cdot b_j^n$ such that $\alpha_j \in \mathbb{Q}[\mathcal{V}]$ and $b_j \in \mathbb{Z}$. Afterwards,
the runtime bound for $L_t$ can be lifted to a runtime bound for the original loop
by reconsidering the automorphism $\vartheta$. Similarly, in order to prove termination
of the prs loop $L$, we analyze termination of $L_t$ on $\vartheta(\mathbb{Z}^d) = \{\vartheta(x) \mid x \in \mathbb{Z}^d\}$.²


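Lemma 22 (Runtime Bounds for PRS Loops). Let $L$ be a prs loop with
period $p$ and let $L_p = (\varphi_p, \eta_p)$ result from chaining as in Definition 20. From $\eta_p$,
one can compute a linear automorphism $\vartheta : \mathcal{V} \to \mathbb{Q}[\mathcal{V}]$ as in Lemma 12, such
that:


(a) $L_p$ is solvable and only has integer eigenvalues.

(b) $(\vartheta^{-1} \circ \eta_p \circ \vartheta) : \mathcal{V} \to \mathbb{Q}[\mathcal{V}]$ is a twn-update as in Definition 11 such that all
$c_i \in \mathbb{Z}$.

(c) $L_t = (\varphi_t, \eta_t)$ with $\varphi_t = \vartheta^{-1}(\varphi_p)$ and $\eta_t = \vartheta^{-1} \circ \eta_p \circ \vartheta$ is a twn-loop. Moreover,
the following holds:
• $L$ terminates on $\mathbb{Z}^d$ iff
• $L_p$ terminates on $\mathbb{Z}^d$ iff
• $L_t$ terminates on $\vartheta(\mathbb{Z}^d) = \{\vartheta(x) \mid x \in \mathbb{Z}^d\}$.

(d) If $r$ is a runtime bound³ for $L_t$, then $p \cdot \lceil|\vartheta(r)|\rceil + p - 1$ is a runtime bound for
$L$.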


Fig. 1. Illustration of Runtime and Size Bound Computations


Since we can detect prs loops and their periods by Lemma 19, Lemma 22
allows us to compute runtime bounds for all terminating prs loops. This is
illustrated in Fig. 1: For runtime bounds, $L$ is transformed to $L_p$ by chaining and $L_p$
is transformed further to $L_t$ by an automorphism $\vartheta$. The runtime bound $r$ for $L_t$
can then be transformed into a runtime bound for $L_p$ and further into a runtime
bound for $L$. For size bounds, $L$ is directly transformed to a twn-loop $L'_t$ by an
automorphism $\vartheta'$. The closed form $\mathrm{cl}_t$ obtained for $L'_t$ is transformed via the
automorphism $\vartheta'$ into a closed form $\mathrm{cl}_s$ for $L$. Then the runtime bound for $L$ is
inserted into this closed form to yield a size bound for $L$. So in Fig. 1, standard
arrows denote transformations of loops and wavy arrows denote transformations
of runtime bounds or closed forms.


2 By [17], termination of $L_t$ on $\vartheta(\mathbb{Z}^d)$ is reducible to invalidity of a formula
$\exists x \in \mathbb{Q}^d.\; \psi_{\vartheta(\mathbb{Z}^d)} \land \xi_{L_t}$. Here, $\psi_{\vartheta(\mathbb{Z}^d)}$ holds iff $x \in \vartheta(\mathbb{Z}^d)$ and $\xi_{L_t}$ holds iff $L_t$ does not
terminate on $x$. As shown in [17], non-termination of linear twn-loops with integer
eigenvalues is NP-complete and it is semi-decidable for twn-loops with non-linear
arithmetic.

3 More precisely, $|\sigma|(r) \geq \inf\{n \in \mathbb{N} \mid \sigma(\eta_t^n(\neg\varphi_t))\}$ must hold for all $\sigma : \mathcal{V} \to \vartheta(\mathbb{Z}^d)$.

Theorem 23 (Completeness of Size and Runtime Bound Computation
for Terminating PRS Loops). For all terminating prs loops, polynomial 
runtime bounds and finite size bounds are computable. For terminating unit prs 
loops, all these size bounds are polynomial as well. 


Example 24. For the loop $L$ from (1), we computed $L_p$ for $p = 2$ in (2), see
Example 21. As $L_p$ is already a twn-loop, we can use the technique of [20]
(implemented in our tool KoAT) to obtain the runtime bound $x_3$ for $L_p$. Lemma
22 yields the runtime bound $2 \cdot x_3 + 1$ for the original loop (1). Of course, here
one could also use (incomplete) approaches based on linear ranking functions
(also implemented in KoAT, see, e.g., [8,19]) to directly infer the tighter runtime
bound $x_3$ for the loop (1).
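
The lifting of runtime bounds from the chained loop to the original loop can be observed on concrete runs. The Python sketch below counts the iterations of loop (1) and of the chained loop (2) and tests that $p \cdot r + p - 1$ with $p = 2$ dominates the runtime of loop (1) whenever $r$ bounds the runtime of loop (2). The function names and the cutoff `limit` are illustrative assumptions; the sketch is a test on sample states, not the bound computation of [20].

```python
def runtime(state, guard_fn, update_fn, limit=10_000):
    """Number of iterations until the guard fails (None beyond `limit`)."""
    for n in range(limit + 1):
        if not guard_fn(state):
            return n
        state = update_fn(state)
    return None

def guard_1(s):
    """Guard of loop (1)."""
    return s[2] > 0

def eta_1(s):
    """Update of loop (1)."""
    x1, x2, x3, x4 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)

def guard_2(s):
    """Guard of the chained loop (2)."""
    return s[2] > 0 and s[2] > 1

def eta_2(s):
    """Update of the chained loop (2)."""
    x1, x2, x3, x4 = s
    return (-x1, -x2, x3 - 2, x4 + (x3 - 1) ** 2 + x3 ** 2)

def lifted_bound(r, p):
    """Lift a runtime bound r of the chained loop to the original loop."""
    return p * r + p - 1
```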


4 Size Bounds for Integer Programs 


Up to now, we focused on isolated loops. In the following, we incorporate our 
complete approach from Sect. 2 and 3 into the setting of general integer programs 
where most questions regarding termination or complexity are undecidable. Formally,
an integer program is a tuple $(\mathcal{V}, \mathcal{L}, \ell_0, \mathcal{T})$ with a finite set of variables $\mathcal{V}$,
a finite set of locations $\mathcal{L}$, a fixed initial location $\ell_0 \in \mathcal{L}$, and a finite set of
transitions $\mathcal{T}$. A transition is a 4-tuple $(\ell, \varphi, \eta, \ell')$ with a start location $\ell \in \mathcal{L}$, target
location $\ell' \in \mathcal{L} \setminus \{\ell_0\}$, guard $\varphi \in \mathcal{F}(\mathcal{V})$, and update $\eta : \mathcal{V} \to \mathbb{Z}[\mathcal{V}]$. To simplify the
presentation, we do not consider “temporary” variables (whose update is
non-deterministic), but the approach can easily be extended accordingly. Transitions
$(\ell_0, \_, \_, \_)$ are called initial and $\mathcal{T}_0$ denotes the set of all initial transitions.


[Figure: transition graph of the example program with locations $\ell_0$, $\ell_1$, $\ell_2$ and transitions $t_0, \ldots, t_4$]

Fig. 2. An Integer Program with Non-Linear Size Bounds


Example 25. In the integer program of Fig. 2, we omitted identity updates
$\eta(v) = v$ and guards where $\varphi$ is true. Here, $\mathcal{V} = \{x_1, \ldots, x_5\}$ and $\mathcal{L} = \{\ell_0, \ell_1, \ell_2\}$,
where $\ell_0$ is the initial location. Note that the loop in (1) corresponds to transition
$t_1$.


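Definition 26 (Correspondence between Loops and Transitions). Let
$t = (\ell, \varphi, \eta, \ell)$ be a transition with $\varphi \in \mathcal{F}(\mathcal{V}')$ for some variables $\mathcal{V}' \subseteq \mathcal{V}$ such
that $\eta(x) = x$ for all $x \in \mathcal{V} \setminus \mathcal{V}'$ and $\eta(x) \in \mathbb{Z}[\mathcal{V}']$ for all $x \in \mathcal{V}'$. A loop $(\varphi', \eta')$
with $\varphi' \in \mathcal{F}(\{x_1, \ldots, x_d\})$ and $\eta' : \{x_1, \ldots, x_d\} \to \mathbb{Z}[\{x_1, \ldots, x_d\}]$ corresponds
to the transition $t$ via the variable renaming $\pi : \{x_1, \ldots, x_d\} \to \mathcal{V}'$ if $\varphi$ is $\pi(\varphi')$
and for all $1 \leq i \leq d$ we have $\eta(\pi(x_i)) = \pi(\eta'(x_i))$.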


To define the semantics of integer programs, an evaluation step moves from
one configuration $(\ell, \sigma) \in \mathcal{L} \times \Sigma$ to another configuration $(\ell', \sigma')$ via a transition
$(\ell, \varphi, \eta, \ell')$ where $\sigma(\varphi)$ holds. Here, $\sigma'$ is obtained by applying the update $\eta$ on
$\sigma$. From now on, we fix an integer program $\mathcal{P} = (\mathcal{V}, \mathcal{L}, \ell_0, \mathcal{T})$.


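Definition 27 (Evaluation of Programs). For configurations $(\ell, \sigma)$, $(\ell', \sigma')$
and $t = (\ell_t, \varphi, \eta, \ell_t') \in \mathcal{T}$, $(\ell, \sigma) \to_t (\ell', \sigma')$ is an evaluation step if $\ell = \ell_t$,
$\ell' = \ell_t'$, $\sigma(\varphi) = \mathit{true}$, and $\sigma(\eta(v)) = \sigma'(v)$ for all $v \in \mathcal{V}$. Let $\to_{\mathcal{T}} = \bigcup_{t \in \mathcal{T}} \to_t$,
where we also write $\to$ instead of $\to_t$ or $\to_{\mathcal{T}}$. Let $(\ell_0, \sigma_0) \to^k (\ell_k, \sigma_k)$ abbreviate
$(\ell_0, \sigma_0) \to \ldots \to (\ell_k, \sigma_k)$ and let $(\ell, \sigma) \to^* (\ell', \sigma')$ if $(\ell, \sigma) \to^k (\ell', \sigma')$ for some
$k \geq 0$.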


Example 28. If we encode states as tuples $(\sigma(x_1), \ldots, \sigma(x_5)) \in \mathbb{Z}^5$, then
$(-6, -8, 2, 1, 1) \to_{t_0} (-6, -8, 2, 1, 1) \to_{t_1}^2 (6, 8, 0, 6, 1) \to_{t_2} (6, 8, 0, 6, 1) \to_{t_4}^6 (0, 8, 0, 6, 1)$.
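The run of Example 28 can be replayed with a few lines of Python. This is only a sketch: the updates of $t_1$ and $t_4$ below are assumptions (the update of $t_1$ is taken to be that of the loop (1), i.e., $x_1 \mapsto 3 x_1 + 2 x_2$, $x_2 \mapsto -5 x_1 - 3 x_2$, $x_3 \mapsto x_3 - 1$, $x_4 \mapsto x_4 + x_3^2$, and $t_4$ is assumed to decrement $x_1$ while $x_1 > 0$), and the function names are hypothetical.

```python
# States are tuples (x1, x2, x3, x4, x5), as in Example 28.

def t1(state):
    """One evaluation step of t1 (assumed update of loop (1), guard x3 > 0)."""
    x1, x2, x3, x4, x5 = state
    assert x3 > 0, "guard of t1 does not hold"
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2, x5)

def t4(state):
    """One evaluation step of t4 (assumed: x1 is decremented, guard x1 > 0)."""
    x1, x2, x3, x4, x5 = state
    assert x1 > 0, "guard of t4 does not hold"
    return (x1 - 1, x2, x3, x4, x5)

def iterate(step, state, k):
    """Apply the same transition k times."""
    for _ in range(k):
        state = step(state)
    return state

start = (-6, -8, 2, 1, 1)
after_t1 = iterate(t1, start, 2)     # two steps of t1, until x3 = 0
after_t4 = iterate(t4, after_t1, 6)  # six steps of t4, until x1 = 0
print(after_t1, after_t4)
```

Under these assumptions, the two intermediate states coincide with the configurations listed in Example 28.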


Now we define size bounds for variables $v$ after evaluating a transition $t$:
$SB(t, v)$ is a size bound for $v$ w.r.t. $t$ if for any run starting in $\sigma_0 \in \Sigma$,
$|\sigma_0|(SB(t, v))$ is greater than or equal to the largest absolute value of $v$ after
evaluating $t$.


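Definition 29 (Size Bounds [8,19]). A function $SB : (\mathcal{T} \times \mathcal{V}) \to \mathcal{B}$ is a
(global) size bound for the program $\mathcal{P}$ if for all $(t, x) \in \mathcal{T} \times \mathcal{V}$ and all states $\sigma_0 \in \Sigma$
we have $|\sigma_0|(SB(t, x)) \geq \sup\{|\sigma'(x)| \mid \exists \ell' \in \mathcal{L}.\ (\ell_0, \sigma_0)\ (\to^* \circ \to_t)\ (\ell', \sigma')\}$.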


Later in Lemma 35, we will compare the notion of size bounds for transitions 
in a program from Definition 29 to our earlier notion of size bounds for loops 
from Definition 6. 


Example 30. As an example, we give size bounds for the transitions $t_0$ and $t_3$ in
Fig. 2. Since $t_0$ does not change any variables, a size bound is $SB(t_0, x_i) = x_i$ for
all $1 \leq i \leq 5$. Note that the value of $x_5$ is never increased and is bounded from
below by 0 in any run through the program. Thus, $SB(t_3, x_3) = x_5 = SB(t_3, x_5)$.
Similarly, we have $SB(t_3, x_1) = 2 \cdot x_5$, $SB(t_3, x_2) = 3 \cdot x_5$, and $SB(t_3, x_4) = x_5$.


To infer size bounds for transitions as in Definition 29 automatically, we
lift local size bounds (i.e., size bounds which only hold for a subprogram with
transitions $\mathcal{T}' \subseteq \mathcal{T} \setminus \mathcal{T}_0$) to global size bounds for the complete program. For the
subprogram, one considers runs which start after evaluating an entry transition
of $\mathcal{T}'$.
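Entry transitions, as made precise in Definition 31, depend only on the control-flow graph. The following Python sketch computes them for a hypothetical encoding of Fig. 2; the start and target locations of $t_0, \ldots, t_4$ are assumptions inferred from the examples in this section, and `entry_transitions` is an illustrative name, not part of KoAT.

```python
# Assumed control-flow graph of Fig. 2:
# transition name -> (start location, target location).
TRANSITIONS = {
    "t0": ("l0", "l1"),
    "t1": ("l1", "l1"),
    "t2": ("l1", "l2"),
    "t3": ("l2", "l1"),
    "t4": ("l2", "l2"),
}

def entry_transitions(sub, transitions=TRANSITIONS):
    """Transitions outside `sub` whose target is a start location of `sub`."""
    start_locations = {transitions[t][0] for t in sub}
    return {
        name
        for name, (_, target) in transitions.items()
        if name not in sub and target in start_locations
    }

print(entry_transitions({"t1"}))
print(entry_transitions({"t4"}))
```

With this graph, the subprogram $\{t_1\}$ is entered via $t_0$ and $t_3$, and $\{t_4\}$ is entered via $t_2$.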


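Definition 31 (Entry Transitions [8]). Let $\emptyset \neq \mathcal{T}' \subseteq \mathcal{T} \setminus \mathcal{T}_0$. The entry
transitions of $\mathcal{T}'$ are $\mathcal{E}_{\mathcal{T}'} = \{t \mid t = (\_, \_, \_, \ell) \in \mathcal{T} \setminus \mathcal{T}'$ and there is a $(\ell, \_, \_, \_) \in \mathcal{T}'\}$.


Example 32. For the program in Fig. 2, we have $\mathcal{E}_{\{t_1\}} = \{t_0, t_3\}$ and $\mathcal{E}_{\{t_4\}} = \{t_2\}$.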


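Definition 33 (Local Size Bounds). Let $\emptyset \neq \mathcal{T}' \subseteq \mathcal{T} \setminus \mathcal{T}_0$ and $t' \in \mathcal{T}'$.
$SB_{t'} : \mathcal{V} \to \mathcal{B}$ is a local size bound for $t'$ w.r.t. $\mathcal{T}'$ if for all $x \in \mathcal{V}$ and all $\sigma \in \Sigma$:$^{4}$
$|\sigma|(SB_{t'}(x)) \geq \sup\{|\sigma'(x)| \mid \exists \ell' \in \mathcal{L},\ (\_, \_, \_, \ell) \in \mathcal{E}_{\mathcal{T}'}.\ (\ell, \sigma)\ (\to^*_{\mathcal{T}'} \circ \to_{t'})\ (\ell', \sigma')\}$.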


Theorem 34 below yields a novel modular procedure to infer (global) size
bounds from previously computed local size bounds. A local size bound for a
transition $t'$ w.r.t. a subprogram $\mathcal{T}' \subseteq \mathcal{T} \setminus \mathcal{T}_0$ is lifted by inserting size bounds
for all entry transitions. Again, this is possible because we only use weakly
monotonically increasing functions as bounds. Here, "$b\,[v/p_v \mid v \in \mathcal{V}]$" denotes
the bound which results from replacing every variable $v$ by $p_v$ in the bound $b$.


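Theorem 34 (Lifting Local Size Bounds). Let $\emptyset \neq \mathcal{T}' \subseteq \mathcal{T} \setminus \mathcal{T}_0$, let $SB_{t'}$
be a local size bound for a transition $t'$ w.r.t. $\mathcal{T}'$, and let $SB : (\mathcal{T} \times \mathcal{V}) \to \mathcal{B}$
be a size bound for $\mathcal{P}$. Let $SB'(t', x) = \sum_{r \in \mathcal{E}_{\mathcal{T}'}} SB_{t'}(x)\,[v/SB(r, v) \mid v \in \mathcal{V}]$ and
$SB'(t, x) = SB(t, x)$ for all $t \neq t'$. Then $SB'$ is also a size bound for $\mathcal{P}$.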


To obtain local size bounds which can then be lifted via Theorem 34, we
look for transitions $t_L$ that correspond to a loop $L$ and then we compute a size
bound for $L$ as in Sect. 2 and 3. The following lemma shows that size bounds
for loops as in Definition 6 indeed yield local size bounds for the corresponding
transitions.$^{5}$


Lemma 35 (Local Size Bounds via Loops). Let $SB_L$ be a size bound for
a loop $L$ (as in Definition 6) which corresponds to a transition $t_L$ via a variable
renaming $\pi$. Then $\pi \circ SB_L \circ \pi^{-1}$ is a local size bound for $t_L$ w.r.t. $\{t_L\}$ (as in
Definition 33).


Example 36. $SB_L(x_4) = x_4 + 3 \cdot x_3^3 + 2 \cdot x_3^2 + x_3$ is a size bound for $x_4$ in the
loop (1), see Example 8. This loop corresponds to transition $t_1$ in the program
of Fig. 2. Since $\mathcal{E}_{\{t_1\}} = \{t_0, t_3\}$ by Example 32, Theorem 34 yields the following
(non-linear) size bound for $x_4$ in the full program of Fig. 2 (see Example 30 for
$SB(t_0, v)$ and $SB(t_3, v)$):

$$\begin{aligned}
SB(t_1, x_4) &= SB_L(x_4)\,[v/SB(t_0, v) \mid v \in \mathcal{V}] + SB_L(x_4)\,[v/SB(t_3, v) \mid v \in \mathcal{V}] \\
&= (x_4 + 3 \cdot x_3^3 + 2 \cdot x_3^2 + x_3) + (x_5 + 3 \cdot x_5^3 + 2 \cdot x_5^2 + x_5) \\
&= 3 \cdot x_3^3 + 3 \cdot x_5^3 + 2 \cdot x_3^2 + 2 \cdot x_5^2 + x_3 + x_4 + 2 \cdot x_5
\end{aligned}$$

Analogously, we infer the remaining size bounds $SB(t_1, x_i)$, e.g., $SB(t_1, x_1) =
(4 \cdot x_1 + 2 \cdot x_2)\,[v/SB(t_0, v) \mid v \in \mathcal{V}] + (4 \cdot x_1 + 2 \cdot x_2)\,[v/SB(t_3, v) \mid v \in \mathcal{V}] = 4 \cdot x_1 +
2 \cdot x_2 + 14 \cdot x_5$.
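The lifting of Theorem 34 can be read semantically: evaluate the local bound once per entry transition, with every variable replaced by the size bound of that entry transition, and add up the results. The Python sketch below does this for the bound on $x_1$ from Example 36. Representing bounds as functions on states of absolute values is a simplification chosen for this illustration (KoAT works with symbolic bounds), and all names are hypothetical.

```python
def lift_size_bound(local_bound, entries, SB, variables):
    """Sum over all entry transitions r of the local bound in which
    every variable v is replaced by SB(r, v)."""
    def lifted(sigma):
        total = 0
        for r in entries:
            # Sizes of the variables right after the entry transition r.
            after_r = {v: SB[(r, v)](sigma) for v in variables}
            total += local_bound(after_r)
        return total
    return lifted

# Size bounds for the entry transitions t0 and t3 (Example 30);
# only x1 and x2 occur in the local bound below.
SB = {
    ("t0", "x1"): lambda s: s["x1"],
    ("t0", "x2"): lambda s: s["x2"],
    ("t3", "x1"): lambda s: 2 * s["x5"],
    ("t3", "x2"): lambda s: 3 * s["x5"],
}

# Local size bound 4*x1 + 2*x2 for x1 w.r.t. {t1}.
def local_x1(s):
    return 4 * s["x1"] + 2 * s["x2"]

sb_t1_x1 = lift_size_bound(local_x1, ["t0", "t3"], SB, ["x1", "x2"])

# A state of absolute values (non-negative integers).
sigma = {"x1": 1, "x2": 2, "x5": 5}
print(sb_t1_x1(sigma), 4 * sigma["x1"] + 2 * sigma["x2"] + 14 * sigma["x5"])
```

On the sample state, the lifted bound agrees with the closed expression $4 \cdot x_1 + 2 \cdot x_2 + 14 \cdot x_5$ from Example 36.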


4 To simplify the formalism, in this definition, we consider every possible configuration
$(\ell, \sigma)$ and not only configurations which are reachable from the initial location $\ell_0$.

5 Local or global size bounds for transitions only have to hold if the transition is indeed
taken. In contrast, size bounds for loops also have to hold if there is no loop iteration.
This will be needed in Theorem 38 to compute local size bounds for simple cycles.

Our approach alternates between improving size and runtime bounds for
individual transitions. We start with $SB(t_0, x) = |\eta(x)|$ for initial transitions
$t_0 \in \mathcal{T}_0$ where $\eta$ is $t_0$'s update, and $SB(t, \_) = \omega$ for $t \in \mathcal{T} \setminus \mathcal{T}_0$. Here, similar to
the notion $\lceil |p| \rceil$ in Sect. 2, for every polynomial $p = \sum_j c_j \cdot \beta_j$ with normalized
monomials $\beta_j$, $|p|$ is the polynomial $\sum_j |c_j| \cdot \beta_j$. To improve the size bounds of
transitions that correspond to (possibly non-linear) solvable loops, we can use
closed forms (Theorem 7) and the lifting via Theorem 34. Otherwise, we use an
existing incomplete technique [8] to improve size bounds (where [8] essentially
only succeeds for updates without non-linear arithmetic). In this way, we can
automatically compute polynomial size bounds for all remaining transitions and
variables in the program of Fig. 2 (e.g., we obtain $SB(t_2, x_1) = SB(t_1, x_1) =
4 \cdot x_1 + 2 \cdot x_2 + 14 \cdot x_5$).

Both the technique from [8] and our approach from Theorem 7 rely on run- 
time bounds to compute size bounds. On the other hand, as shown in [8,19,27], 
size bounds for “previous” transitions are needed to infer (global) runtime 
bounds for transitions in a program. For that reason, the alternated compu- 
tation resp. improvement of global size and runtime bounds for the transitions 
is repeated until all bounds are finite. We will illustrate this in more detail in 
Sect. 5. 

In Definition 26 and Lemma 35 we considered transitions with the same start
and target location that directly correspond to loops. To increase the applicability
of our approach, as in [27] now we consider so-called simple cycles, where
iterations through the cycle can only be done in a unique way. So the cycle must
not have subcycles and there must not be any non-determinism concerning the
next transition to be taken. Formally, $C = \{t_1, \ldots, t_n\} \subseteq \mathcal{T}$ is a simple cycle
if there are pairwise different locations $\ell_1, \ldots, \ell_n$ such that $t_i = (\ell_i, \_, \_, \ell_{i+1})$
for $1 \leq i \leq n - 1$ and $t_n = (\ell_n, \_, \_, \ell_1)$. To handle simple cycles, we chain
transitions.$^{6}$


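Definition 37 (Chaining (see, e.g., [27])). Let $t_1, \ldots, t_n \in \mathcal{T}$ where $t_i =
(\ell_i, \varphi_i, \eta_i, \ell_{i+1})$ for all $1 \leq i \leq n - 1$. Then the transition $t_1 \star \ldots \star t_n =
(\ell_1, \varphi, \eta, \ell_{n+1})$ results from chaining $t_1, \ldots, t_n$ where

$\varphi = \varphi_1 \wedge \eta_1(\varphi_2) \wedge \eta_2(\eta_1(\varphi_3)) \wedge \ldots \wedge \eta_{n-1}(\ldots \eta_1(\varphi_n) \ldots)$
$\eta(v) = \eta_n(\ldots \eta_1(v) \ldots)$ for all $v \in \mathcal{V}$, i.e., $\eta = \eta_n \circ \ldots \circ \eta_1$.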
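Chaining can also be understood on the level of states: the chained guard holds if each guard holds at the point where its transition is taken, and the chained update applies the updates one after the other. The following Python sketch takes this semantic view (locations are omitted, and transitions are modeled as pairs of Python functions). The transition `t1` uses the assumed update of the loop (1), and `chain` is an illustrative helper, not KoAT's implementation.

```python
def chain(first, second):
    """Chain two transitions, each given as a (guard, update) pair
    of functions on state tuples (x1, x2, x3, x4, x5)."""
    guard1, update1 = first
    guard2, update2 = second

    def guard(state):
        # The second guard is checked on the state reached by the first update.
        return guard1(state) and guard2(update1(state))

    def update(state):
        return update2(update1(state))

    return guard, update

# Assumed encoding of t1: guard x3 > 0 and the update of loop (1).
t1 = (
    lambda s: s[2] > 0,
    lambda s: (3 * s[0] + 2 * s[1], -5 * s[0] - 3 * s[1],
               s[2] - 1, s[3] + s[2] ** 2, s[4]),
)

guard, update = chain(t1, t1)
print(guard((-6, -8, 2, 1, 1)), update((-6, -8, 2, 1, 1)))
print(guard((0, 0, 1, 0, 0)))  # a second step is not possible when x3 = 1
```

Chaining `t1` with itself negates $x_1$ and $x_2$ under this encoding, which matches the two $t_1$-steps of the run in Example 28.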


Now we want to compute a local size bound for the transition $t_n$ w.r.t. a
simple cycle $C = \{t_1, \ldots, t_n\}$ where a loop $L$ corresponds to $t_1 \star \ldots \star t_n$ via
$\pi$. Then a size bound $SB_L$ for the loop $L$ yields the size bound $\pi \circ SB_L \circ
\pi^{-1}$ for $t_n$ regarding runs through $C$ starting in $t_1$. However, to obtain a local
size bound $SB_{t_n}$ w.r.t. $C$, we have to consider runs starting after any entry
transition $(\_, \_, \_, \ell_i) \in \mathcal{E}_C$. Hence, we use $|\eta_n(\ldots \eta_i(\pi(SB_L(\pi^{-1}(x)))) \ldots)|$ for
any $(\_, \_, \_, \ell_i) \in \mathcal{E}_C$. In this way, we also capture evaluations starting in $\ell_i$, i.e.,
without evaluating the complete cycle.


6 The chaining of a loop $L$ in Definition 20 corresponds to $p - 1$ chaining steps of a
transition $t_L$ via Definition 37, i.e., to $t_L \star \ldots \star t_L$.

Theorem 38 (Local Size Bounds for Simple Cycles). Let $C = \{t_1, \ldots, t_n\}
\subseteq \mathcal{T}$ be a simple cycle and let $SB_L$ be a size bound for a loop $L$ which corresponds
to $t_1 \star \ldots \star t_n$ via a variable renaming $\pi$. Then a local size bound $SB_{t_n}$ for $t_n$ w.r.t.
$C$ is

$SB_{t_n}(x) = \sum_{1 \leq i \leq n,\ (\_, \_, \_, \ell_i) \in \mathcal{E}_C} |\eta_n(\ldots \eta_i(\pi(SB_L(\pi^{-1}(x)))) \ldots)|$.


Example 39. As an example, in the program of Fig. 2 we replace $t_1 = (\ell_1, x_3 >
0, \eta_1, \ell_1)$ by $t_{1a} = (\ell_1, \mathit{true}, \eta_{1a}, \ell_1')$ and $t_{1b} = (\ell_1', x_3 > 0, \eta_{1b}, \ell_1)$ with a new
location $\ell_1'$, where $\eta_{1a}(v) = \eta_1(v)$ for $v \in \{x_1, x_2\}$, $\eta_{1b}(v) = \eta_1(v)$ for $v \in \{x_3, x_4\}$,
and $\eta_{1a}$ resp. $\eta_{1b}$ are the identity on the remaining variables. Then $\{t_{1a}, t_{1b}\}$
forms a simple cycle and Theorem 38 allows us to compute local size bounds
$SB_{t_{1b}}$ and $SB_{t_{1a}}$ w.r.t. $\{t_{1a}, t_{1b}\}$, because the chained transitions $t_{1a} \star t_{1b} = t_1$
and $t_{1b} \star t_{1a}$ both correspond to the loop (1). They can then be lifted to global size
bounds as in Example 36 using size bounds for the entry transitions $\mathcal{E}_{\{t_{1a}, t_{1b}\}} =
\{t_0, t_3\}$.


This shows how we choose $t'$ and $\mathcal{T}'$ when lifting local size bounds to global
ones with Theorem 34: For a transition $t'$ we search for a simple cycle $\mathcal{T}'$ such
that chaining the cycle results in a twn- or suitable solvable loop and the size
bounds of $\mathcal{E}_{\mathcal{T}'}$ are finite. For all other transitions, we compute size bounds as
in [8].


5 Completeness of Size and Runtime Analysis for Programs


For individual loops, we showed in Theorem 23 that polynomial runtime bounds 
and finite size bounds are computable for all terminating prs loops. In this 
section, we discuss completeness of the size bound technique from the previous 
section and of termination and runtime complexity analysis for general integer 
programs. We show that for a large class of programs consisting of consecutive 
prs loops, in case of termination we can always infer finite runtime and size 
bounds. 

To this end, we briefly recapitulate how size bounds are used to compute 
runtime bounds for general integer programs, and show that our new technique 
to infer size bounds also results in better runtime bounds. We call $RB : \mathcal{T} \to \mathcal{B}$ a
(global) runtime bound if for every transition $t \in \mathcal{T}$ and state $\sigma_0 \in \Sigma$, $|\sigma_0|(RB(t))$
over-approximates the number of evaluations of $t$ in any run starting in $(\ell_0, \sigma_0)$.


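Definition 40 (Runtime Bound [8,19]). A function $RB : \mathcal{T} \to \mathcal{B}$ is a
(global) runtime bound if for all $t \in \mathcal{T}$ and all states $\sigma_0 \in \Sigma$, we have
$|\sigma_0|(RB(t)) \geq \sup\{n \in \mathbb{N} \mid \exists (\ell', \sigma').\ (\ell_0, \sigma_0)\ (\to^* \circ \to_t)^n\ (\ell', \sigma')\}$.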


For our example in Fig. 2, a global runtime bound for $t_0$, $t_2$, and $t_3$ is
$RB(t_0) = 1$ and $RB(t_2) = RB(t_3) = x_5$, as $x_5$ is bounded from below by
$t_3$'s guard $x_5 > 1$ and the value of $x_5$ decreases by 1 in $t_3$, and no transition
increases $x_5$.


16 N. Lommen and J. Giesl 


To infer global runtime bounds automatically, similarly to size bounds, we
first consider a smaller subprogram $\mathcal{T}' \subseteq \mathcal{T}$ and compute local runtime bounds
for non-empty subsets $\mathcal{T}'_> \subseteq \mathcal{T}'$. A local runtime bound measures how often
a transition $t \in \mathcal{T}'_>$ can occur in a run through $\mathcal{T}'$ that starts after an entry
transition $r \in \mathcal{E}_{\mathcal{T}'}$. Thus, local runtime bounds do not consider how many $\mathcal{T}'$-runs
take place in a global run and they do not consider the sizes of the variables
before starting a $\mathcal{T}'$-run. We lift these local bounds to global runtime bounds
for the complete program afterwards.


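Definition 41 (Local Runtime Bound [27]). Let $\emptyset \neq \mathcal{T}'_> \subseteq \mathcal{T}' \subseteq \mathcal{T}$.
$RB_{\mathcal{T}'_>} \in \mathcal{B}$ is a local runtime bound for $\mathcal{T}'_>$ w.r.t. $\mathcal{T}'$ if for all $t \in \mathcal{T}'_>$, all
$r \in \mathcal{E}_{\mathcal{T}'}$ with $r = (\ell, \_, \_, \_)$, and all $\sigma \in \Sigma$, we have $|\sigma|(RB_{\mathcal{T}'_>}) \geq \sup\{n \in \mathbb{N} \mid
\exists \sigma_0, (\ell', \sigma').\ (\ell_0, \sigma_0) \to^*_{\mathcal{T}} \circ \to_r (\ell, \sigma)\ (\to^*_{\mathcal{T}'} \circ \to_t)^n\ (\ell', \sigma')\}$.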


Example 42. In Fig. 2, local runtime bounds for $\mathcal{T}'_> = \mathcal{T}' = \{t_1\}$ and for $\mathcal{T}'_> =
\mathcal{T}' = \{t_4\}$ are $RB_{\{t_1\}} = x_3$ and $RB_{\{t_4\}} = x_1$. Local runtime bounds can often
be inferred automatically by approaches based on ranking functions (see, e.g.,
[8]) or by the complete technique for terminating prs loops (see Theorem 23).


If we have a local runtime bound $RB_{\mathcal{T}'_>}$ w.r.t. $\mathcal{T}'$, then setting $RB(t)$ to
$\sum_{r \in \mathcal{E}_{\mathcal{T}'}} RB(r) \cdot (RB_{\mathcal{T}'_>}\,[v/SB(r, v) \mid v \in \mathcal{V}])$ for all $t \in \mathcal{T}'_>$ yields a global runtime
bound [27]. Here, we over-approximate the number of local $\mathcal{T}'$-runs which are
started by an entry transition $r \in \mathcal{E}_{\mathcal{T}'}$ by an already computed global runtime
bound $RB(r)$. Moreover, we instantiate each $v \in \mathcal{V}$ by a size bound $SB(r, v)$ to
consider the size of $v$ before a local $\mathcal{T}'$-run is started. So as mentioned in Sect. 4,
we need runtime bounds to infer size bounds (see Theorem 7 and the inference of 
global size bounds in [8]), and on the other hand we need size bounds to compute 
runtime bounds. Thus, our implementation alternates between size bound and 
runtime bound computations (see [8,27] for a more detailed description of this 
alternation). 


Example 43. Based on the local runtime bounds in Example 42, we can compute
the remaining global runtime bounds for our example. We obtain $RB(t_1) =
RB(t_0) \cdot (x_3\,[v/SB(t_0, v) \mid v \in \mathcal{V}]) + RB(t_3) \cdot (x_3\,[v/SB(t_3, v) \mid v \in \mathcal{V}]) = x_3 + x_5^2$
and $RB(t_4) = RB(t_2) \cdot (x_1\,[v/SB(t_2, v) \mid v \in \mathcal{V}]) = x_5 \cdot (4 \cdot x_1 + 2 \cdot x_2 + 14 \cdot x_5)$.
Thus, overall we have a quadratic runtime bound $\sum_{0 \leq i \leq 4} RB(t_i)$. Note that it is
due to our new size bound technique from Sect. 2-4 that we obtain polynomial
runtime bounds in this example. In contrast, to the best of our knowledge, all
other state-of-the-art tools fail to infer polynomial size or runtime bounds for
this example. Similarly, if one modifies $t_4$ such that instead of $x_1$, $x_4$ is decreased
as long as $x_4 > 0$ holds, then our approach again yields a polynomial runtime
bound, whereas none of the other tools can infer finite runtime bounds.
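The lifting of local runtime bounds used in Example 43 follows the same pattern as the lifting of size bounds, with an additional factor $RB(r)$ for every entry transition $r$. The Python sketch below replays the computation of $RB(t_1)$. As a simplification for this illustration, bounds are modeled as functions on states of absolute values, and the names are hypothetical.

```python
def lift_runtime_bound(local_rb, entries, RB, SB, variables):
    """Sum over all entry transitions r of RB(r) times the local runtime
    bound in which every variable v is replaced by SB(r, v)."""
    def lifted(sigma):
        total = 0
        for r in entries:
            # Sizes of the variables before the local run starts.
            after_r = {v: SB[(r, v)](sigma) for v in variables}
            total += RB[r](sigma) * local_rb(after_r)
        return total
    return lifted

# Global runtime bounds of the entry transitions of {t1}.
RB = {"t0": lambda s: 1, "t3": lambda s: s["x5"]}

# Size bounds for x3 after t0 and t3 (Example 30).
SB = {
    ("t0", "x3"): lambda s: s["x3"],
    ("t3", "x3"): lambda s: s["x5"],
}

# Local runtime bound x3 for {t1} (Example 42).
def local_rb_t1(s):
    return s["x3"]

rb_t1 = lift_runtime_bound(local_rb_t1, ["t0", "t3"], RB, SB, ["x3"])

sigma = {"x3": 2, "x5": 3}
print(rb_t1(sigma), sigma["x3"] + sigma["x5"] ** 2)
```

On the sample state, the lifted bound agrees with $x_3 + x_5^2$.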


Finally, we state our completeness results for integer programs. For a set
$C \subseteq \mathcal{T}$ and $\ell, \ell' \in \mathcal{L}$, let $\ell \leadsto_C \ell'$ hold iff there is a transition $(\ell, \_, \_, \ell') \in C$. We
say that $C$ is a component if we have $\ell \leadsto_C^+ \ell'$ for all locations $\ell, \ell'$ occurring in
$C$, where $\leadsto_C^+$ is the transitive closure of $\leadsto_C$. So in particular, we must also have
$\ell \leadsto_C^+ \ell$ for all locations $\ell$ in the transitions of $C$. We call an integer program
simple if every component is a simple cycle that is "reachable" from any initial
state.


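Definition 44 (Simple Integer Program). An integer program $(\mathcal{V}, \mathcal{L}, \ell_0, \mathcal{T})$
is simple if every component $C \subseteq \mathcal{T}$ is a simple cycle, and for every entry
transition $(\_, \_, \_, \ell) \in \mathcal{E}_C$ and every $\sigma_0 \in \Sigma$, there is an evaluation
$(\ell_0, \sigma_0) \to^*_{\mathcal{T}} (\ell, \sigma_0)$.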


In Fig. 2, $\mathcal{T} \setminus \{t_0\}$ is a component that is no simple cycle. However, if we
remove $t_3$ and replace $t_0$'s guard by true, then the resulting program $\mathcal{P}'$ is
simple (but not linear). A simple program terminates iff each of its isolated 
simple cycles terminates. Thus, if we can prove termination for every simple 
cycle, then the overall program terminates. Hence, if after chaining, every simple 
cycle corresponds to a linear, unit prs loop, then we can decide termination 
and infer polynomial runtime and size bounds for the overall integer program. 
For terminating, non-unit prs loops, runtime bounds are still polynomial but 
size bounds can be exponential. Hence, then the global runtime bounds can be 
exponential as well. Note that in the example program P’ above, the eigenvalues 
of the update matrices of $t_1$ and $t_4$ have absolute value 1, i.e., $t_1$ and $t_4$ correspond
to unit prs loops. Hence, by Theorem 45 we obtain polynomial runtime and size 
bounds for P’. 


Theorem 45 (Completeness Results for Integer Programs) 


(a) Termination is decidable for all simple linear integer programs where after 
chaining, all simple cycles correspond to prs loops. 

(b) Finite runtime and size bounds are computable for all simple integer pro- 
grams where after chaining, all simple cycles correspond to terminating prs 
loops. 

(c) If in addition to (b), all simple cycles correspond to unit prs loops, then the 
runtime and size bounds are polynomial. 


In the definition of simple integer programs (Definition 44), we required that
for every component $C$ and every entry transition $(\_, \_, \_, \ell) \in \mathcal{E}_C$, there is an
evaluation $(\ell_0, \sigma_0) \to^*_{\mathcal{T}} (\ell, \sigma_0)$ for every $\sigma_0 \in \Sigma$. If one strengthens this by
requiring that one can reach $\ell$ from $\ell_0$ using only transitions whose guard is true
and whose update is the identity, then the class of programs in Theorem 45 (a)
is decidable (there are only $n$ ways to chain a simple cycle with $n$ transitions
and checking whether a loop is a prs loop is decidable by Lemma 19).


6 Conclusion and Evaluation 


Conclusion. In this paper, we developed techniques to infer size bounds auto- 
matically and to use them in order to obtain bounds on the runtime complexity 
of programs. This yields a complete procedure to prove termination and to infer
runtime and size bounds for a large class of integer programs. Moreover, we
showed how to integrate the complete technique into an (incomplete) modular 
technique for general integer programs. To sum up, we presented the following 
new contributions in this paper: 


(a) We showed how to use closed forms in order to infer size bounds for loops 
with possibly non-linear arithmetic in Theorem 7. 

(b) We proved completeness of our novel approach for terminating prs loops 
(see Theorem 23) in Sect. 3. 

(c) We embedded our approach for loops into the setting of general integer 
programs in Sect.4 and showed completeness of our approach for simple 
integer programs with only prs loops in Sect. 5. 

(d) Finally, we implemented a prototype of our procedure in our re- 
implementation of the tool KoAT, written in OCaml. It integrates the compu- 
tation of size bounds via closed forms for twn-loops and homogeneous (and 
thus linear) solvable loops into the complexity analysis for general integer 
programs.$^{7}$


To infer local runtime bounds as in Definition 41, KoAT first applies 
multiphase-linear ranking functions (see [5,19]), which can be done very effi- 
ciently. For twn-loops where no finite bound was found, it then uses the com- 
putability of runtime bounds for terminating twn-loops (see [17,20,27]). When 
computing size bounds, KoAT first applies the technique of [8] for reasons of 
efficiency and in case of exponential or infinite size bounds, it tries to compute 
size bounds via closed forms as in the current paper. Here, SymPy [30] is used 
to compute Jordan normal forms for the transformation to twn-loops. Moreover, 
KoAT applies a local control-flow refinement technique [19] (using the tool iRank- 
Finder [13]) and preprocesses the program in the beginning, e.g., by extending 
the guards of transitions with invariants inferred by Apron [24]. For all SMT prob- 
lems, KoAT uses Z3 [31]. In the future, we plan to extend the runtime bound 
inference of KoAT to prs loops and to extend our size bound computations also 
to suitable non-linear non-twn-loops. 


Evaluation. To evaluate our new technique, we tested KoAT on the 504 bench- 
marks for Complexity of C Integer Programs (CINT) from the Termination Prob- 
lems Data Base [35] which is used in the annual Termination and Complexity 
Competition (TermComp) [18]. Here, all variables are interpreted as integers over 
Z (i.e., without overflows). To distinguish the original version of KoAT [8] from 
our re-implementation, we refer to them as KoAT1 resp. KoAT2. We used the 
following configurations of KoAT2, which apply different techniques to infer size 
bounds. 


• KoAT2orig uses the original technique from [8] to infer size bounds.
• KoAT2+SIZE additionally uses our novel approach with Theorems 7, 34, and
38.


7 For a homogeneous solvable loop, the closed form of the twn-loop over $\mathbb{A}$ that results
from its transformation is particularly easy to compute.

The CINT collection contains almost only examples with linear arithmetic
and the existing tools can already solve most of its benchmarks which are not
known to be non-terminating.$^{8}$ While most complexity analyzers are essentially
restricted to programs with linear arithmetic, our new approach also succeeds on 
programs with non-linear arithmetic. Some programs with non-linear arithmetic 
could already be handled by KoAT due to our integration of the complete tech- 
nique for the inference of local runtime bounds in [27]. But the approach from 
the current paper increases KoAT’s power substantially for programs (possibly 
with non-linear arithmetic) where the values of variables computed in “earlier” 
loops influence the runtime of “later” loops (e.g., the modification of our example 
from Fig. 2 where t4 decreases x4 instead of xı, see the end of Example 43). 


Table 1. Evaluation on the Collection CINT+


Tool        | O(1) | O(n)    | O(n^2) | O(n^{>2}) | O(EXP) | < ω      | AVG+(s) | AVG(s)
KoAT2+SIZE  | 26   | 233 (2) | 71 (1) | 25 (9)    | 3 (2)  | 358 (14) | 9.97    | 22.88
KoAT2orig   | 26   | 232 (1) | 70     | 15        | 5 (4)  | 348 (5)  | 8.29    | 21.52
MaxCore     | 23   | 220 (4) | 67 (1) | 7         | 0      | 317 (5)  | 1.96    | 5.25
CoFloCo     | 22   | 197 (1) | 66     | 5         | 0      | 290 (1)  | 0.59    | 2.68
KoAT1       | 25   | 170 (1) | 74     | 12        | 8 (3)  | 289 (4)  | 0.96    | 3.49
Loopus      | 17   | 171 (1) | 50 (1) | 6 (1)     | 0      | 244 (3)  | 0.40    | 0.40


Therefore, we extended CINT by 15 new typical benchmarks including the 
programs in (1), Fig. 2, and the modification of Fig. 2 discussed above, as well 
as several benchmarks from the literature (e.g., [3,6]), resulting in the collection
CINT+. For KoAT2 and KoAT1, we used Clang [11] and llvm2kittel [14] to
transform C into integer programs as in Sect. 4. We compare KoAT2 with KoAT1 
[8] and the tools CoFloCo [15], MaxCore [2] with CoFloCo in the backend, and 
Loopus [33]. These tools also rely on variants of size bounds: CoFloCo uses a set 
of constraints to measure the size of variables w.r.t. their initial and final val- 
ues, MaxCore’s size bound computations build upon [12], and Loopus considers 
suitable bounding invariants to infer size bounds. 

Table 1 gives the results of our evaluation, where as in TermComp, we used a 
timeout of 5 min per example. The first entry in every cell denotes the number of 
benchmarks from CINT+ for which the tool inferred the respective bound. The 
number in brackets only considers the 15 new examples. The runtime bounds 
computed by the tools are compared asymptotically as functions which depend 
on the largest initial absolute value n of all program variables. So for example, 
KoAT2+SIZE proved a linear runtime bound for 231 + 2 = 233 benchmarks,
i.e., $rc(\sigma_0) \in O(n)$ holds for all initial states $\sigma_0$ where $|\sigma_0(v)| \leq n$ for all $v \in \mathcal{V}$.


8 iRankFinder [13] proves non-termination for 119 programs in CINT. KoAT2orig 
already infers finite runtimes for 343 of the remaining 504 — 119 = 386 examples 
in CINT.

Overall, this configuration succeeds on 358 examples, i.e., "< ω" is the number
of examples where a finite bound on the runtime complexity could be computed
by the tool within the time limit. "AVG+(s)" denotes the average runtime of
successful runs in seconds, whereas "AVG(s)" is the average runtime of all runs.

Already on the original benchmarks CINT, integrating our novel technique 
for the inference of size bounds leads to the most powerful approach for run- 
time complexity analysis. The effect of the new size bound technique becomes 
even clearer when also considering our new examples which contain non-linear 
arithmetic and loops whose runtime depends on the results of earlier loops in the 
program. Thus, the new contributions of the paper are crucial in order to extend 
automated complexity analysis to larger programs with non-linear arithmetic. 

KoAT's source code, a binary, and a Docker image are available at
https://koat.verify.rwth-aachen.de/size. This website also has details on our experiments,
a list and description of the new examples, and web interfaces to run KoAT's
configurations directly online.


Abstract. Many problems in mathematics and computer science
involve summations. We present a procedure that automatically proves 
equations involving finite summations, inspired by the theory of holo- 
nomic sequences. The procedure is designed to be interleaved with the 
activities of a higher-order automatic theorem prover. It performs an 
induction and automatically solves the induction step, leaving the base 
cases to the theorem prover. 


1 Introduction 


Finite summations, that is, summations $\sum_{i=m}^{n} t_i$ over finitely many terms $t_i$,
are ubiquitous in mathematics and computer science, but they are poorly
supported by automatic theorem provers. One reason is that summations are
higher-order, whereas most theorem provers are first-order.

In recent years, we have seen the rise of higher-order provers [2,3,16-18].
With these provers, $\sum_{i=m}^{n} t_i$ can be represented as sum m n (λi. t_i); the
traditional $\sum$ syntax can be seen as syntactic sugar. But despite the use of heuristics
[17, Sect. 4], higher-order provers are ill-equipped to reason inductively. A simple
problem such as $\sum_{i=0}^{n} i = n(n + 1)/2$ is a formidable challenge for them, even if
we include axioms for $+$, $\cdot$, $/$, and $\sum$ together with an induction principle.

In this paper, we introduce a procedure for proving such equations in a 
higher-order prover. The procedure is triggered by a proof goal of the form 
ky s +t = u, possibly with some conditions (Sect. 2). In a refutational prover, 
the equation would be negated, as ký `s + t # u, and would correspond to the 
negated conjecture, a problem axiom, or some clause derived by the prover. 

Our procedure translates facts about summations to linear recurrences. These 
recurrences have almost the same form as multivariate holonomic sequences [20], 
which, while not being a prerequisite for reading this paper, strongly inspired our 
work. Each recurrence is associated with a multivariate sequence, that is, a sequence
with one or more indices. In this paper, the word "sequence" generally means
"multivariate sequence."

The procedure has three steps.

1. Initialization (Sect. 3): Heuristically choose terms in the goal to generalize and
perform induction on. Among the problem axioms, select those of a suitable 
form as initial recurrences for the procedure. 

2. Propagation (Sect. 4): From the initial recurrences, compute recurrences
corresponding to the goal. For $+$, $\cdot$, and $\sum$ expressions occurring in the goal,
recurrences are computed from the recurrences of their direct subexpressions.

3. Induction (Sect. 5): If the final recurrences for the goal involve only the goal 
and no other sequences, use them for induction. If they make the difference 
of successive values of kẹ `s +t-— u constantly 0, this establishes the induction 
step. Then reduce the goal to a set of base cases and give these to the prover. 


Propagation and induction apply holonomic-style techniques almost as a black 
box. Initialization connects them to the overall proof search. 

For example, to prove $\sum_{i=0}^{n} i = n(n + 1)/2$, the procedure would transform
the equation into recurrences and find out that the difference $\sum_{i=0}^{n} i - n(n+1)/2$
remains constant as $n$ increases, thereby establishing the induction step. If that
difference is constantly 0, we get $\sum_{i=0}^{n} i = n(n + 1)/2$; in general, it suffices to
prove a number of base cases, which are left to the prover. This example is very
simple, but the procedure scales up to more sophisticated problems (Sect. 6). An
implementation is under way in the Zipperposition prover [17].
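The idea behind the induction step can be illustrated numerically. The Python sketch below only tests finitely many values and is therefore no proof; it is not the procedure of this paper, and `goal_difference` is a hypothetical helper name. It checks that the difference between the two sides of the example equation does not change from $n$ to $n + 1$, and that it vanishes in the base case $n = 0$.

```python
def goal_difference(n):
    """Left-hand side minus right-hand side of sum_{i=0}^{n} i = n(n+1)/2."""
    # n * (n + 1) is always even, so the integer division is exact.
    return sum(range(n + 1)) - n * (n + 1) // 2

# Differences of successive values: all zero means the goal difference
# stays constant as n increases (on the tested range).
steps = [goal_difference(n + 1) - goal_difference(n) for n in range(50)]

# Base case: the difference itself is zero for n = 0.
base_case = goal_difference(0)

print(all(step == 0 for step in steps), base_case)
```

Once the successive differences are known to be zero, a single base case determines the value of the difference for all $n \geq 0$.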

The procedure treats $\sum$ as an interpreted (built-in) symbol. The summation
expression evaluates to a value in a commutative group, or a ring if ring
multiplication is present. The commutative group or ring gives us $+$, $\cdot$, and $-$. These are
also interpreted, as are numerals. Integers, including indices, can multiply group
elements. Based on the interpretation, we use the forms $t = u$ and $t - u = 0$
interchangeably.

Compared with Wilf-Zeilberger pairs [19] and other methods (Sect. 7), the 
main benefit of our procedure is that it goes beyond holonomic sequences and 
supports both uninterpreted functions and an infinite number of base cases. Our 
procedure is widely applicable and may help prove not only difficult summations 
in a restrictive form but also easier summations in a more general form, which 
is useful in a general-purpose theorem prover. At the heart of our work is the 
novel combination of techniques from superposition and holonomic sequences, 
which is visible both in the prover integration (Sect.2) and in the computation 
of so-called excess terms (Sect. 4). We refer to our technical report [14] for more 
details. 


2 Inference Rule 


Our procedure can be integrated into a theorem prover, where it takes the form 
of an inference rule that complements the prover’s existing rules. Our technical 
report discusses an integration with satisfiability modulo theories (SMT) and 
tableaux; here, we present a rule for superposition: 


$$\frac{C_1 \quad \cdots \quad C_l \qquad C' \vee t[\bar{s}] \neq 0}{D \vee C' \vee \bigvee_{\bar{b} \in B} t[\bar{b}] \neq 0}\ \text{SUMMATION}$$

These side conditions apply:


• $t[\bar{s}]$ is an expression that can be brought into the general form kX; mt +t”;

• the procedure selects, generalizes, and performs an induction on the subterms
$\bar{s}$ of $t$ (Sect. 3);

• the procedure succeeds at proving the induction step based on initial
recurrences derived from $C_1, \ldots, C_l$ (Sect. 3) and their propagation (Sect. 4);

• the procedure identifies $B$ as the finite set of base cases of the induction,
where each case is a vector $\bar{b}$ of terms of the same length as $\bar{s}$ (Sect. 5); and

• the subclause $D$ captures potential conditions determined by the procedure.


The intuition behind the rule is that the conclusion should be easier to refute 
than the rightmost premise. As for the premises $C_1, \ldots, C_l$, they can contain
useful information about $\bar{s}$, often about bounds.


3 Initialization 


The first step of our procedure is to recognize the structure of recurrences. Vari- 
ables on which we can perform induction appear as Skolem constants in the 
negated goal. Further opportunities for induction can be created by generalizing 
complex terms. Also as part of this step, we must choose which terms represent 
(multivariate) sequences and which clauses represent their recurrences. 


Theory Detection. We require the necessary theory of summation to be predefined.
Specifically, this refers to the inductive theory of integers, axioms for
commutative groups (including multiplication by integers), and the definition of
summation from 0 by $\sum_{n=0}^{-1} f_n = 0$ and
$\sum_{n=0}^{m+1} f_n = \sum_{n=0}^{m} f_n + f_{m+1}$ even for
negative $m \in \mathbb{Z}$. Other finite intervals than $[0, m]$ are expressible as differences.
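
The following is a minimal Python sketch of this definition of summation, not part of the procedure itself. The names `sum_from_zero` and `sum_interval` are hypothetical. It assumes integer-valued sequences and shows how the same recurrence, read backwards, fixes the value of the sum for negative upper bounds, and how other intervals arise as differences.

```python
from typing import Callable


def sum_from_zero(f: Callable[[int], int], m: int) -> int:
    """Sum of f(0), ..., f(m), extended to negative m by the defining recurrence."""
    total = 0
    if m >= 0:
        for n in range(m + 1):
            total += f(n)
    else:
        # sum(-1) = 0. Solving sum(m + 1) = sum(m) + f(m + 1) for sum(m)
        # gives sum(m) = sum(m + 1) - f(m + 1), applied for m = -2, -3, ...
        for n in range(-1, m, -1):
            total -= f(n)
    return total


def sum_interval(f: Callable[[int], int], lo: int, hi: int) -> int:
    """Sum of f(lo), ..., f(hi), written as a difference of two sums from 0."""
    return sum_from_zero(f, hi) - sum_from_zero(f, lo - 1)
```

With this reading, the recurrence holds uniformly for every integer upper bound, which is what lets shifts act on summation bounds without case distinctions.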

Ring multiplication may be absent, so we do not take it as predefined. Instead, 
we search candidate binary operators from the negated goal. For each candidate, 
we can try to prove left and right distributivity by syntactically looking for that 
axiom or by running another instance of the prover. Distributivity is the only 
necessary property to apply the procedure, but associativity, commutativity, and 
the unit element can also be used in simplifications. 


Term Generalization. Term generalization transforms Skolem constants or 
complex terms into variables and then performs an induction on the variables. 
We propose a straightforward heuristic: For each nonnumeral subterm $s$ of type
$\mathbb{Z}$ occurring in the negated goal, generalize $s$ if $s$ stays variable-free even after
recursively applying this heuristic on the proper subterms of $s$ itself. For example,
in the following variable-free integer terms, the underlined subterms would be
generalized: $\underline{a}$, $123$, $\underline{f\;0\;2}$, $2 f\,(\underline{g\,(-1)})\,(-3\underline{a})$, $f\;1\;(g\,(\underline{a} + 1))$, $f\,(g\,\underline{a})\;7\underline{a}$.
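
The heuristic can be sketched in Python as follows. The term encoding is an assumption made only for this illustration: numerals are Python integers, Skolem constants are strings, and applications are tuples `(head, arg1, ...)`. All subterms are assumed to have type Z, and the function name `generalized_subterms` is hypothetical.

```python
def generalized_subterms(term):
    """Return the subterms that the heuristic would turn into variables."""
    if isinstance(term, int):
        # Numerals are never generalized.
        return []
    if isinstance(term, str):
        # A Skolem constant has no proper subterms, so it is generalized.
        return [term]
    _head, *args = term
    picked = [s for arg in args for s in generalized_subterms(arg)]
    # If nothing below was generalized, the term is still variable-free
    # and is generalized as a whole. Otherwise it now contains a variable.
    return picked if picked else [term]
```

For instance, under this encoding the term `("f", 1, ("g", ("+", "a", 1)))` yields only `"a"`, whereas `("f", 0, 2)` is generalized as a whole.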

Let $\bar s = (s_1, \ldots, s_d)$ be the subterms chosen for generalization. Then, based
on the negated goal $C' \lor t[\bar s] \neq 0$ (as in the SUMMATION rule), generalization
sets up the goal $\forall \bar n \in N.\ t[\bar n] = 0$ where $N \subseteq \mathbb{Z}^d$ collects the bounds of $\bar s$ (often
$N = \mathbb{N}^d$). We try to prove this goal up to base cases and other mild conditions.

The generalization makes it possible to use induction to prove that the goal
sequence term $t[\bar n]$, a function of $\bar n$, equals zero on $N$. We try to prove the
generalized goal assuming $\lnot C'$ and some extra conditions $E$ such as the base
cases of the induction. Then, instantiating $\bar n := \bar s$, we conclude $C' \lor \lnot E \lor t[\bar s] = 0$.
This, together with the negated goal $C' \lor t[\bar s] \neq 0$, implies a conclusion of the
form $C' \lor \lnot E$ for the SUMMATION rule. Note that $C'$ is not generalized.

The set $N$ embodies knowledge about $\bar s$ that we find among existing clauses
$C_1, \ldots, C_l$ and the condition $\lnot C'$. The free variables of $\lnot C'$ are interpreted as
constants, and they can also occur in $\bar s$. For example, assume that $\bar s = f\,s'$ and
$\bar n = n$ and that the generalized goal contains the factorial $n!$. Its recurrence
must be in a conditional clause, e.g., $(m+1)! = (m+1)\,m! \lor 0 \not\le m$. To use
this recurrence for $n!$, we need $n \ge 0$, which we can ensure using $N$ if we find a
bounding clause $f\,s' \ge 0$ or its generalization such as $f\,m \ge 0$ where $m$ is a free
variable. The more we know about $\bar s$, the more recurrences we can get. At the
same time, $N$ must allow induction, so we keep it convex by considering only
coordinatewise bounds of $\bar s$.


Form of Sequence Terms. Sequence terms are terms of the underlying higher- 
order logic that our procedure can work with. From their structure, we distin- 
guish (pointwise) addition and multiplication, summation, and affine substitu- 
tion. This gives a first-order grammar to express the sequence terms. 


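Definition 1. Sequence terms on a ring $A$ are inductively defined as follows.
The logic's terms of type $A$ with distinguished integer variables $\bar n$ are sequence
terms. If $f_{\bar n}$ and $g_{\bar n}$ are sequence terms with $d$ variables $\bar n$, then so are $f_{\bar n} + g_{\bar n}$,
$f_{\bar n} \cdot g_{\bar n}$, $\sum_{n_j=0}^{\bar c \cdot \bar n + a} f_{\bar n}$, and $\sigma f_{\bar n} = f_{\sigma \bar n}$ where $\bar c$ is a vector, $a$ is an integer,
$\bar c \cdot \bar n = c_1 n_1 + \cdots + c_d n_d$, and $\sigma$ is an affine substitution (meaning $\sigma \bar n = q \bar n + \bar b$
for a matrix $q$ and a vector $\bar b$); $a$, the entries of $\bar c$, and the entries of $\sigma$ (meaning
the entries of $q$ and $\bar b$) must be numerals.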


Remark 2. In Definition 1 and in the sequel, a commutative group can be 
used instead of a ring if ring multiplication is absent. In this case, all formulas 
involving ring multiplication (e.g., fz - gz) should be ignored. 

We view sequence terms as functions $\mathbb{Z}^d \to A$. We then write the sequence
terms from the definition compactly as $f + g$, $f \cdot g$, $\sum_j^{\mathfrak a} f$, and $\sigma f$, and call
$\mathfrak a = \bar c \cdot \bar n + a$ an affine variable sum. Moreover, since $\cdot$, $\sum_j^{\mathfrak a}$, and $\sigma$ all distribute
over $+$, we can write any sequence term as $c_1 f^1 + \cdots + c_k f^k$ where the coefficients
$c_j$ are numerals and the sequence terms $f^j$ are distinct and do not contain $+$.
Finally, we forbid variable shadowing: pe, binds nj, and while a g and 
E; g and other references to nj outside X; are syntactically valid, we avoid 
such forms by renaming them during encoding and never reintroducing them. 


Choice of Initial Recurrences. Semantically, the recurrences we look for
are multivariate heterogeneous linear finite-fixed-step equations with polynomial
coefficients. An archetypical example is


(n? + 1) fn+2,m+1 + Mfn+1,m = nM fama = 2hn,m + (m rx n) hn,m+1 + 1 (1) 
Here, the sequences $f$, $h$, $1$ are bivariate, and the sequence indices are all of the
form $n + k$ or $m + k$ for numerals $k \in \mathbb{Z}$, amounting to finite fixed steps.

The general form is $0 = P_1 g^1 + \cdots + P_k g^k = \bar P \cdot \bar g$ where $\bar g = (g^1, \ldots, g^k)$ is a
tuple of sequence terms and $\bar P$ is a tuple of operator polynomials as defined below.
If $k = 1$, we have a homogeneous recurrence of $g^1$; otherwise, it is heterogeneous.


Definition 3. Operator polynomials are a $\mathbb{Z}$-algebra with composition as product
(meaning closed under addition, composition, and integer multiplication)
spanned by the multiplier and shift operators:


• The multiplier operator $M_j$ of index $j$ multiplies a multivariate sequence $f$
by the variable $n_j$ of index $j$: $(M_j f)_{\bar n} = n_j f_{\bar n}$.

• The shift operator $S_j = \{n_j \mapsto n_j + 1\}$ of index $j$ increments the variable $n_j$
of index $j$ of a multivariate sequence $f$ by one: $(S_j f)_{\bar n} = f_{\{n_j \mapsto n_j + 1\}\bar n}$.
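
As an illustration, the two operators can be modeled in Python as higher-order functions on sequences, which makes it easy to test pointwise the commutation behavior that follows from these definitions, namely $S_j M_j = M_j S_j + S_j$. This is only a sketch under the assumption that sequences are Python functions of integer arguments. The names `M`, `S`, and `agree` are hypothetical, and indices are 0-based here.

```python
from itertools import product


def M(j):
    """Multiplier operator of index j: (M_j f)(n) = n_j * f(n)."""
    return lambda f: (lambda *n: n[j] * f(*n))


def S(j):
    """Shift operator of index j: (S_j f)(n) = f(n with n_j replaced by n_j + 1)."""
    return lambda f: (lambda *n: f(*n[:j], n[j] + 1, *n[j + 1:]))


def agree(lhs, rhs, dim=2, lo=-3, hi=4):
    """Compare two sequences pointwise on a small grid of integer points."""
    return all(lhs(*p) == rhs(*p) for p in product(range(lo, hi), repeat=dim))
```

For an arbitrary bivariate test sequence `f`, `agree(S(0)(M(0)(f)), lambda *p: M(0)(S(0)(f))(*p) + S(0)(f)(*p))` checks the noncommutation relation, while `agree(S(0)(M(1)(f)), M(1)(S(0)(f)))` checks that operators of different indices commute.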

With $d$ index variables, the operator polynomials look like ordinary polynomials
$\mathbb{Z}[M_1, \ldots, M_d, S_1, \ldots, S_d]$, but the composition product is noncommutative
since $S_i M_i = M_i S_i + S_i$ for all $i = 1, \ldots, d$ (a derivation of which is given
in the next section directly above equation (2)). As an example of expressing
recurrences in terms of operator polynomials, consider the previous archetypical
recurrence (1). Taking $n$ as the first and $m$ as the second variable, the recurrence
reads


Remark 4. The expression $\bar P \cdot \bar g$ identifying a recurrence is itself a sequence
term. It suffices to observe that if $f$ is a sequence term, then so are the substitution
$S_j f$ and the product $M_j f = (\bar n \mapsto n_j) \cdot f$ with the projection sequence
term $\bar n \mapsto n_j$.


As sketched in Sect. 1, we must select some of the problem axioms as initial 
recurrences for the procedure. This is accomplished as follows. Let there be an 
edge between two axioms of the form C V s = t (where C may be empty) if they 
both contain a top-level occurrence of the same sequence g, i.e., an occurrence of 
g that is not nested inside an uninterpreted function symbol. The axioms then 
form a graph. We take as initial recurrences the connected component of the 
generalized goal. 
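
A minimal sketch of this selection step in Python is shown below. It assumes that each axiom has already been mapped to the set of sequences occurring at its top level. That preprocessing, the axiom names, and the function name `initial_recurrences` are hypothetical and serve only the illustration.

```python
from collections import deque


def initial_recurrences(top_level_sequences, goal):
    """Return the connected component of `goal` in the shared-sequence graph.

    `top_level_sequences` maps each axiom name to the set of sequences that
    occur in it outside of uninterpreted function symbols.
    """
    seen = {goal}
    queue = deque([goal])
    while queue:
        current = queue.popleft()
        for other, seqs in top_level_sequences.items():
            # Two axioms are adjacent if they share a top-level sequence.
            if other not in seen and seqs & top_level_sequences[current]:
                seen.add(other)
                queue.append(other)
    return seen
```

An axiom that mentions only sequences unrelated to the goal, even transitively, is left out of the result.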

By a sequence $g$, we mean the $f\,\bar a$ part of a term of the form $f\,\bar a\,\bar n$ where $f$ is
an uninterpreted function symbol, $\bar a$ is a tuple of variable-free terms, and $\bar n$ is a
nonempty tuple of integer variables or affine (i.e., linear term + constant term)
combinations of them. The tuples $\bar a$ and $\bar n$ may in general be interleaved.

In other contexts, an analogous step is known as lemma filtering or premise 
selection [4, Sect. 2]. Clutter from irrelevant facts is less of an issue in the context 
of our procedure because it can use only linear recurrences. Beyond this, our 
simple heuristic does nothing to avoid clutter. 

What should we do about conditions such as $C$ in $C \lor f\,\bar a\,\bar n = t$? We could
forbid them and work only with unit equations such as $f\,\bar a\,\bar n = t$. We could collect
them and put them in the $D$ component of the SUMMATION rule's conclusion.
Or we could attempt to prove them when the initial recurrences are selected. In
our ongoing implementation, we chose the first option, but what the best option
is remains an open question.


4 Propagation 


Holonomic sequences can be defined by homogeneous recurrences with polynomial
coefficients and finitely many base cases. They are closed under the four
operations that build sequence terms ($+$, $\cdot$, $\sum$, $\sigma$), which especially makes their
equality decidable [20]. The closure is realized by four procedures to derive
recurrences of a sequence term from the recurrences of its immediate subterms, which
we call propagation. We can propagate independently of the base cases and hence
work on nonholonomic sequence terms [6]. Although we expect the holonomic
subcase to be decidable in our setting, in general decidable equality is lost.
Additionally, unlike in the holonomic setting, we allow heterogeneous recurrences. We
will build this into our noncommutative Gröbner basis setup that is used in the
propagation procedures.


Gröbner Bases of Recurrence Operators. A (generalized) Gröbner basis is
a certain well-behaved generating set of a left-ideal of (possibly noncommutative) 
polynomials. Equivalently, we will view it as a system of polynomial equations 
that is complete for rewriting. Given a polynomial equation P = 0, for every 
monomial M we get a rewrite rule as follows. Decompose MP as MP = L+ R 
where L is the leading monomial of MP w.r.t. a fixed monomial ordering times 
its coefficient. Then L = —R gives rise to a rewrite rule L — —R. A system of 
equations is complete for rewriting if every one of its consequences can be proved 
via rewriting by these rules. 


Example 5. The system $\{ab^2 = a + b,\ a^2 b = a + 1\}$ does not prove its consequence
$a^2 = b$ by rewriting. (We can see that $a^2 = b$ is a consequence by
multiplying the first equation by $a$ and the second equation by $b$ and then by
subtracting the two equations.) In the other direction, the system's Gröbner basis
$\{a^2 = b,\ b^2 = a + 1\}$ does give rewrite proofs

$$ab^2 \xrightarrow{\,b^2 = a + 1\,} a^2 + a \xrightarrow{\,a^2 = b\,} b + a \qquad\text{and}\qquad a^2 b \xrightarrow{\,a^2 = b\,} b^2 \xrightarrow{\,b^2 = a + 1\,} a + 1.$$
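
The contrast in this example can be replayed with a small rewriting sketch in Python. It is restricted to commutative polynomials in two indeterminates and is not the noncommutative setup used by the procedure. The encoding (a polynomial as a dictionary from exponent pairs `(i, j)` for $a^i b^j$ to integer coefficients) and the name `normal_form` are assumptions of this illustration.

```python
def normal_form(poly, rules):
    """Rewrite `poly` until no rule applies.

    `poly` maps exponent pairs (i, j), standing for a^i b^j, to coefficients.
    `rules` is a list of ((i, j), rhs) pairs meaning a^i b^j -> rhs, where
    every rhs has lower total degree than its left-hand side.
    """
    poly = {m: c for m, c in poly.items() if c}
    while True:
        for (li, lj), rhs in rules:
            # Find a monomial divisible by the rule's left-hand side.
            hit = next((m for m in poly if m[0] >= li and m[1] >= lj), None)
            if hit is None:
                continue
            c = poly.pop(hit)
            di, dj = hit[0] - li, hit[1] - lj
            for (ri, rj), rc in rhs.items():
                key = (ri + di, rj + dj)
                poly[key] = poly.get(key, 0) + c * rc
                if poly[key] == 0:
                    del poly[key]
            break
        else:
            return poly


# Rules read off the original system and off its Groebner basis.
ORIGINAL = [((1, 2), {(1, 0): 1, (0, 1): 1}), ((2, 1), {(1, 0): 1, (0, 0): 1})]
BASIS = [((2, 0), {(0, 1): 1}), ((0, 2), {(1, 0): 1, (0, 0): 1})]
```

Rewriting $a^2 - b$, encoded as `{(2, 0): 1, (0, 1): -1}`, with `ORIGINAL` leaves it unchanged, whereas `BASIS` reduces it and both original equations to the empty polynomial.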


A theory of Gröbner bases exists for various polynomial algebras [10]. In
our setting, a sufficient requirement is that all indeterminates $X$, $Y$ commute up
to lower-order terms: $XY - YX \in \mathbb{Z}X + \mathbb{Z}Y + \mathbb{Z}$. The operator polynomials of
Definition 3 fall into this category with the natural choice of taking all multiplier
and shift operators as indeterminates. Indeed, for any sequence term $f$, we have
the noncommutation relations

$$(S_j M_j f)_{\bar n} = (S_j (\bar n \mapsto n_j f_{\bar n}))_{\bar n} = (n_j + 1)\,(S_j f)_{\bar n} = ((M_j S_j + S_j) f)_{\bar n}$$

and all other pairs of multipliers and shifts commute exactly. That is:

$$S_i M_j = M_j S_i + \delta_{i,j} S_i \qquad S_i S_j = S_j S_i \qquad M_i M_j = M_j M_i \tag{2}$$

for all $i$ and $j$, where $\delta_{i,j}$ equals 1 if $i = j$ and 0 otherwise. When we consider a
formal polynomial algebra (necessary to perform Gröbner basis computations),
we will usually mean polynomials with integer coefficients and indeterminates
$M_1, M_2, \ldots, S_1, S_2, \ldots$ satisfying (2). Exceptionally, when we use propagation to
substitution, we will consider compositions of shifts formally as further individual
indeterminates, as explained above Procedure 12. Apart from this exceptional
setting, we fix a choice of monomials as follows.


Definition 6. In our setting, a monomial is a polynomial of the form
$M_1^{x_1} \cdots M_d^{x_d} S_1^{y_1} \cdots S_d^{y_d}$ where the exponents $x_j, y_j \in \mathbb{N}$ are numerals.


Due to the (non)commutation relations (2), polynomials can be written as 
sums of monomials times their integer coefficients. This makes working with 
these noncommutative polynomials similar to working with commutative ones. 
A major difference is that monomials are not closed under product, as illustrated 
by $S_1 M_1 = M_1 S_1 + S_1$. This complicates the definition of monomial order below,
which in turn defines how to interpret a polynomial equation as a rewrite rule. 


Definition 7. A monomial order $\prec$ is a well-founded total order on monomials
such that for all monomials $A, B, C$, if $A \prec B$, then the leading monomial of $CA$
is $\prec$-smaller than the leading monomial of $CB$; here, the leading monomial of a
nonzero polynomial $P$ means the $\prec$-largest monomial occurring in $P$.


Buchberger's algorithm to compute Gröbner bases (also in a noncommutative
context) is similar to saturation-based theorem proving. It repeatedly derives
from polynomial equations $P = 0$ and $R = 0$ new equations $AP - BR = 0$ where
coefficient-monomial products $A$, $B$ make the leading monomials of $AP$ and $BR$
cancel. It suffices to take $A$, $B$ with smallest total degree and coprime coefficients.
$A$ and $B$ play a similar role to the most general unifier in superposition. Since $S_j$
is semantically bijective, we can and always do cancel it, replacing $S_j R = 0$ by
$R = 0$. This modified completion into a Gröbner basis always terminates. The
standard termination proof reduces to applying noetherianity of commutative
polynomials over $\mathbb{Z}$ or Dickson's lemma [10].

A single operator polynomial $P$ perfectly encodes a linear homogeneous
recurrence $0 = Pg$ of a sequence term $g$. However, we allow any heterogeneous
recurrence of the form $0 = \bar P \cdot \bar f = P_1 f^1 + \cdots + P_k f^k$ where $\bar f = (f^1, \ldots, f^k)$
is an arbitrary tuple of different sequence terms. We can encode this by a single
operator polynomial for the duration of one Gröbner basis computation as
follows. Let $\bar f$ enumerate exactly once all the sequence terms needed to express
the current recurrences with the help of operator polynomials. Let $\bar f$ depend on
$d$ variables. For each $f^j$, we consider a shift $F_j := S_{d+j}$ w.r.t. a so far unused
variable. Then the operator polynomial $\bar P \cdot \bar F$ encodes $0 = \bar P \cdot \bar f$.

This encoding does not respect the semantics of operator polynomials; to
recover it, we must apply the substitution $\{\bar F \mapsto \bar f\}$. However, products such
as $F_i F_j$ remain uninterpretable even with ring-valued sequences because the
operator product (function composition) is different from multiplication of $f^j$'s.
Hence, we will simply discard uninterpretable polynomials after the Gröbner
basis computation. Moreover, from now on, we will freely write $f^j$ for $F_j$.


Definition 8. Let $X_1, \ldots, X_n$ be an enumeration of all multiplier and shift
indeterminates. An $(X_1, \ldots, X_k)$-elimination order is a monomial order such
that $X_j \succ X_{k+1}^{a_{k+1}} \cdots X_n^{a_n}$ for all indices $j \le k$ and all exponents $a_{k+1}, \ldots, a_n \in \mathbb{N}$.


Our default choice for the order is to compare total degree in $X_1, \ldots, X_k$ and
break ties using the total degree reverse lexicographical order [7, Chapter 2 §2].


Procedure 9. Eliminating indeterminates $X_1, \ldots, X_k$ from a finite system of
equations $E$ means computing a Gröbner basis $G$ of $E$ w.r.t. an $(X_1, \ldots, X_k)$-elimination
order and then discarding all polynomials from $G$ that contain any
of $X_1, \ldots, X_k$ or that are not linear in the indeterminates encoding sequence
terms. (As mentioned above, during the Gröbner basis computation, whenever
we derive a polynomial $S_j R$, we replace it by $R$.)


While in principle any Gröbner basis would suffice for elimination, our default
choice is to compute the reduced Gröbner basis (i.e., the fully simplified one).
The nonlinear polynomials can be discarded as soon as they are derived during
the Gröbner basis computation instead of only at the end. Recurrence equations
produced by elimination are logical consequences of the input equations, as we 
explain in our technical report. 

Despite the formally equivalent roles of all sequence terms $f^i$ in the recurrence
$0 = \bar P \cdot \bar f$, we associate with every recurrence a sequence term $f^j$. It is often
convenient to write such a recurrence of $f^j$ as $P_j f^j + e = 0$ where the excess terms
$e = \bar P \cdot \bar f - P_j f^j$ contain all sequence terms $f^i$ except $f^j$. The choice of $f^j$ among $\bar f$
will be determined by the definition of excess terms (Definition 18). However, this
choice remains irrelevant for the individual propagation steps, described below.
We adapt these steps from the four closure properties of holonomic sequences
by carrying excess terms along.


Propagation to Addition. Let us start with addition of sequence terms. 


Procedure 10. Let f and g be sequence terms, and let h be the formal name of 
their addition f +g. The associated recurrences F of f and G of g are propagated 
to those of $h$ by eliminating $f$ and $g$ from $F \cup G \cup \{h = f + g\}$. (By Procedure 9,
this involves computing a Gröbner basis for these equations and then discarding 
the equations containing f or g as well as the corresponding nonlinear terms.) 


Actually, the same propagation technique works if $f + g$ is replaced by any
expression in the general recurrence format $\bar P \cdot \bar l$ (a dot product of operator
polynomials $\bar P$ and sequence terms $\bar l$). The key is that the defining equation
$h = \bar P \cdot \bar l$ is again a linear recurrence. Such propagations could also be done by
iterating more primitive propagations.


Example 11. Consider the goal $\sum_{i=0}^{n} a_i = g_n + a_0$ given $g_0 = 0$ and
$g_{n+2} = g_n + a_{n+1} + a_{n+2}$ for all $n \in \mathbb{N}$. The defining recurrence of $g$ can be written using
the operator polynomials as $S_1^2 g = g + S_1 a + S_1^2 a$. The defining recurrence of the
sum $f_n := \sum_{i=0}^{n} a_i$ is $S_1 f = f + S_1 a$. We must prove that $h_n := g_n + a_0 - f_n$
is 0. To achieve this, we propagate recurrences to $h$ using the elimination procedure
described above (Procedure 9) and the total-degree-based $(f, g)$-elimination
order with $f < g$. Leading monomials are shown in bold:

$$\begin{array}{lll}
 & 0 = \mathbf{S_1^2 g} - g - S_1 a - S_1^2 a & \text{recurrence of } g \\
{-}\,S_1^2 \cdot & 0 = \mathbf{g} + a_0 - f - h & \text{definition of } h \\ \hline
 & 0 = -g - S_1 a - S_1^2 a - a_0 + \mathbf{S_1^2 f} + S_1^2 h & \\
{-}\,S_1 \cdot & 0 = \mathbf{S_1 f} - f - S_1 a & \text{recurrence of } f \\ \hline
 & 0 = -g - S_1 a - a_0 + S_1^2 h + \mathbf{S_1 f} & \\
{-} & 0 = \mathbf{S_1 f} - f - S_1 a & \text{recurrence of } f \\ \hline
 & 0 = -\mathbf{g} - a_0 + S_1^2 h + f & \\
{+} & 0 = \mathbf{g} + a_0 - f - h & \text{definition of } h \\ \hline
 & 0 = S_1^2 h - h &
\end{array}$$

In this example, $h_{n+2} - h_n = 0$ is the only recurrence that does not contain $f$ and
$g$, so we discard the rest of the Gröbner basis calculation. Since $h_{n+2} - h_n = 0$
contains only the sequence $h$, we can use it to prove the induction step (of size
2) of a proof of $\forall n.\ h_n = 0$. We are then left with the two base cases $h_0 = 0$ and
$h_1 = 0$, which the SUMMATION inference would include in its conclusion without
auxiliary symbols ($f$ and $h$) as $\sum_{i=0}^{0} a_i \neq g_0 + a_0 \lor \sum_{i=0}^{1} a_i \neq g_1 + a_0$.
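
The outcome of this example can be checked numerically with a short Python sketch. It does not perform the Gröbner basis computation. It only evaluates $h_n = g_n + a_0 - f_n$ for a concrete sequence $a$ and an arbitrary value of $g_1$, which the problem statement leaves open. The function name `h_values` is hypothetical.

```python
def h_values(a, g1):
    """Return [h_0, ..., h_{N-1}] with h_n = g_n + a_0 - f_n, where N = len(a).

    g is defined by g_0 = 0, g_1 = g1, g_{n+2} = g_n + a_{n+1} + a_{n+2},
    and f_n is the sum a_0 + ... + a_n.
    """
    size = len(a)
    g = [0, g1]
    for n in range(size - 2):
        g.append(g[n] + a[n + 1] + a[n + 2])
    f = []
    running = 0
    for x in a:
        running += x
        f.append(running)
    return [g[n] + a[0] - f[n] for n in range(size)]
```

Whatever `g1` is, the values satisfy `h[n + 2] == h[n]`, and they are all zero exactly when the two base cases hold, that is, when `g1 == a[1]`.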


Propagation to Substitution. Consider a numeral matrix $a = [a_{kj}]_{k,j} \in \mathbb{Z}^{d \times D}$
and a vector $\bar b \in \mathbb{Z}^d$. They characterize an affine substitution $\sigma = \{\bar n \mapsto a\bar n + \bar b\}$
$= \{n_k \mapsto \sum_{j=1}^{D} a_{kj} n_j + b_k \mid 1 \le k \le d\}$. As an operator on sequences, $\sigma$ performs
an affine change of variables: $(\sigma f)_{\bar n} = f_{a\bar n + \bar b}$.

Clearly, any recurrence $Pf = 0$ of $f$ implies $\sigma P f = 0$. Moreover, if $\sigma P = P'\sigma$,
then $P'\sigma f = 0$ gives a recurrence of $\sigma f$. Finding such a $P'$ for a general $P$ can be
reduced to finding an operator polynomial $P^X$ satisfying $\sigma X = P^X \sigma$ for every
indeterminate $X$. This amounts to pushing all indeterminates $X$ leftwards. For
multipliers, we have $\sigma\,(M_1, \ldots, M_d) = (a\,(M_1, \ldots, M_D) + \bar b)\,\sigma$. In contrast, shifts
are easily pushed only rightwards, namely, $S_j \sigma = \sigma S_1^{a_{1j}} \cdots S_d^{a_{dj}}$. Consequently,
the recurrences of $f$ must be first expressed in terms of the composite shifts
$\hat S_j := S_1^{a_{1j}} \cdots S_d^{a_{dj}}$. As operators, these satisfy the (non)commutation relations

$$\hat S_j M_k = (M_k + a_{kj})\,\hat S_j \qquad \hat S_i \hat S_j = \hat S_j \hat S_i \qquad \hat S_i S_j = S_j \hat S_i \tag{3}$$

This makes the $\hat S_j$'s suitable as indeterminates in Gröbner basis computations.
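
The way a shift is pushed through an affine substitution can be tested pointwise with the following Python sketch. It assumes that sequences are Python functions, that the matrix `a` has one row per variable of `f` and one column per variable of the substituted sequence, and that indices are 0-based. The names `affine`, `shift`, and `composite_shift` are hypothetical.

```python
def affine(a, b):
    """Affine substitution sigma with (sigma f)(n) = f(a n + b)."""
    def sigma(f):
        def sigma_f(*n):
            return f(*(sum(row[j] * n[j] for j in range(len(n))) + b_k
                       for row, b_k in zip(a, b)))
        return sigma_f
    return sigma


def shift(j):
    """Shift operator of index j on the variables of the substituted sequence."""
    return lambda f: (lambda *n: f(*n[:j], n[j] + 1, *n[j + 1:]))


def composite_shift(a, j):
    """Composite shift: moves every variable k of f by the entry a[k][j]."""
    return lambda f: (lambda *m: f(*(m_k + row[j] for m_k, row in zip(m, a))))
```

For any column `j`, the sequences `shift(j)(affine(a, b)(f))` and `affine(a, b)(composite_shift(a, j)(f))` agree at every point, which is the relation used to move shifts past the substitution.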

Accordingly, for propagation to substitution, we enlarge our formal polynomial
algebra to also contain the indeterminates $\hat S_1, \hat S_2, \ldots$ satisfying the relations (3),
while also keeping (2). We note that, as operators, the indeterminates
further satisfy (essentially by definition) the relations

$$\hat S_j \prod_{k,\ a_{kj} < 0} S_k^{-a_{kj}} = \prod_{k,\ a_{kj} > 0} S_k^{a_{kj}} \qquad \text{for } j \in \{1, \ldots, D\} \tag{4}$$

We add these new relations to the system of recurrence equations of which we
compute the Gröbner basis. Finally, we extend our notion of monomial from
Definition 6 to mean any polynomial of the form
$M_1^{x_1} \cdots M_d^{x_d} S_1^{y_1} \cdots S_d^{y_d} \hat S_1^{z_1} \cdots \hat S_D^{z_D}$
where the exponents $x_j, y_j, z_j \in \mathbb{N}$ are numerals.


Procedure 12. Recurrences of a sequence term $f$ are propagated to its affine
substitution $(\sigma f)_{\bar n} = f_{a\bar n + \bar b}$ as follows. Eliminate each $S_k$ from the system
of polynomial equations containing both the recurrences of $f$ and the relations (4).
Every resulting recurrence $P(\bar M, \hat{\bar S}) f + e = 0$ implies a recurrence
$P(a\bar M + \bar b, \bar S)\,\sigma f + \sigma e = 0$ of $\sigma f$ where we have collected the indeterminates into
vectors and where $e$ are excess terms that do not contain $f$.

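Example 13. Consider $\sum_{n_1=0}^{n_2} \binom{n_1}{n_2 - n_1} = F_{n_2+1}$ where the Fibonacci numbers
are defined by $F_1 = F_2 = 1$ and $(S_2^2 - S_2 - 1) F = 0$. For the binomial
coefficient $\binom{\cdot}{\cdot}_{n_1,n_2} = \binom{n_1}{n_2} = \frac{n_1!}{n_2!\,(n_1 - n_2)!}$, the recurrence from Pascal's triangle
reads as $(S_1 S_2 - S_2 - 1) \binom{\cdot}{\cdot} = 0$ and extends $\binom{\cdot}{\cdot}$ from $0 \le n_2 \le n_1$
to all $n_2 \in \mathbb{Z}$ and $n_1 \in \mathbb{N}$. Moreover, we have $\binom{n_1}{n_2} = \frac{n_1}{n_2} \binom{n_1 - 1}{n_2 - 1}$, i.e.,
$((M_2 + 1) S_1 S_2 - M_1 - 1) \binom{\cdot}{\cdot} = 0$. We want to propagate these recurrences to
the substitution $\sigma = \{n_1 \mapsto n_1,\ n_2 \mapsto n_2 - n_1\}$. We have $S_1 \sigma = \sigma S_1 S_2^{-1}$ and
$S_2 \sigma = \sigma S_2$. So we introduce for $S_1 S_2^{-1}$ and $S_2$ the indeterminates $\hat S_1$ and $\hat S_2$
whose characterizing recurrences (4) read

$$(\hat S_1 S_2 - S_1) \binom{\cdot}{\cdot} = 0 \quad \text{(i)} \qquad\qquad (\hat S_2 - S_2) \binom{\cdot}{\cdot} = 0 \quad \text{(ii)}$$

Next, we eliminate $S_1, S_2$ in favor of $\hat S_1, \hat S_2$. Here, (ii) immediately rewrites every
$S_2$ to $\hat S_2$ and then (i) becomes $(-S_1 + \hat S_1 \hat S_2) \binom{\cdot}{\cdot} = 0$, which rewrites every $S_1$.
The remaining steps to complete a Gröbner basis w.r.t. some total-degree order
are irrelevant for what we want to illustrate. We factor the result for readability:

$$\begin{aligned}
(-S_1 + \hat S_1 \hat S_2) \tbinom{\cdot}{\cdot} &= 0 \qquad\qquad (-S_2 + \hat S_2) \tbinom{\cdot}{\cdot} = 0 \\
(\hat S_1 \hat S_2^2 - \hat S_2 - 1) \tbinom{\cdot}{\cdot} &= 0 \qquad\qquad ((M_2 + 1) \hat S_2 - M_1 + M_2) \tbinom{\cdot}{\cdot} = 0 \\
((M_1 + 1) \hat S_1 \hat S_2 - (M_1 - M_2 + 2) \hat S_1 - M_1 - 1) \tbinom{\cdot}{\cdot} &= 0 \\
((M_1 - M_2 + 1)(M_1 - M_2 + 2) \hat S_1 - M_1 M_2 - M_2) \tbinom{\cdot}{\cdot} &= 0
\end{aligned}$$

Now $\sigma$ maps the lowest four recurrences to the recurrences of $f_{n_1,n_2} = \binom{n_1}{n_2 - n_1}$
below:

$$\begin{aligned}
(S_1 S_2^2 - S_2 - 1) f &= 0 \qquad\qquad ((M_2 - M_1 + 1) S_2 - 2M_1 + M_2) f = 0 \\
((M_1 + 1) S_1 S_2 - (2M_1 - M_2 + 2) S_1 - M_1 - 1) f &= 0 \\
((2M_1 - M_2 + 1)(2M_1 - M_2 + 2) S_1 - (M_1 + 1)(M_2 - M_1)) f &= 0
\end{aligned}$$

The next step is to propagate to the summation. We postpone it to Example 16.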
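
As a sanity check, the four recurrences obtained for $f_{n_1,n_2} = \binom{n_1}{n_2 - n_1}$ can be evaluated pointwise in Python. The sketch below assumes the extension of the binomial coefficient by 0 outside $0 \le n_2 \le n_1$ described in the example, and it reads each operator polynomial as "shift first, then multiply". The names `binom`, `f`, and `residuals` are hypothetical.

```python
from math import comb


def binom(n, k):
    """Binomial coefficient for n >= 0, extended by 0 to every integer k."""
    return comb(n, k) if k >= 0 else 0


def f(n1, n2):
    """The substituted sequence f_{n1,n2} = binom(n1, n2 - n1)."""
    return binom(n1, n2 - n1)


def residuals(n1, n2):
    """Left-hand sides of the four recurrences of f at the point (n1, n2)."""
    d = 2 * n1 - n2
    return (
        # (S1 S2^2 - S2 - 1) f
        f(n1 + 1, n2 + 2) - f(n1, n2 + 1) - f(n1, n2),
        # ((M2 - M1 + 1) S2 - 2 M1 + M2) f
        (n2 - n1 + 1) * f(n1, n2 + 1) - d * f(n1, n2),
        # ((M1 + 1) S1 S2 - (2 M1 - M2 + 2) S1 - M1 - 1) f
        (n1 + 1) * f(n1 + 1, n2 + 1) - (d + 2) * f(n1 + 1, n2) - (n1 + 1) * f(n1, n2),
        # ((2 M1 - M2 + 1)(2 M1 - M2 + 2) S1 - (M1 + 1)(M2 - M1)) f
        (d + 1) * (d + 2) * f(n1 + 1, n2) - (n1 + 1) * (n2 - n1) * f(n1, n2),
    )
```

All four residuals vanish for natural `n1` and arbitrary integer `n2`, including points where the binomial coefficient is outside its classical range.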
Propagation to Product. Let $\cdot$ be ring multiplication or more generally a
group bihomomorphism. If the sequence terms $f$ and $g$ depend on disjoint sets
of variables, recurrences of $fg = f \cdot g$ are essentially a union of recurrences
of $f$ and $g$. Namely, let $Pf + e = 0$ be any recurrence of $f$ where $P$ is an
operator polynomial on the variables of $f$ and the excess terms $e$ do not contain
$f$. Then $P(fg) + eg = 0$ because $g$ is effectively a constant to $P$, and similarly
for recurrences of $g$. With the help of this special case, propagation to product
can be reduced to propagation to substitution, as explained below.


Procedure 14. Let $f$ and $g$ be sequence terms parameterized by the variables
$\bar n = (n_j)_{j=1}^{d}$. Let $\bar m = (n_{d+j})_{j=1}^{d}$ be a tuple of fresh variables. The recurrences
of $f$ and $g$ are propagated to their pointwise product $fg$ in two steps. First, the
recurrences of the variable-disjoint product $f_{\bar n}\, g_{\bar m}$ are the union of the recurrences
of $f_{\bar n}$ multiplied on the right by $g_{\bar m}$ and of those of $g_{\bar m}$ multiplied on the left by
$f_{\bar n}$. Then the recurrences of $f_{\bar n}\, g_{\bar n} = \{\bar m \mapsto \bar n\}\,(f_{\bar n}\, g_{\bar m})$ are found by propagating
to substitution using Procedure 12.


Propagation to Summation. We finally consider the summations $\sum_{n_1=0}^{n_2} f_{\bar n}$.
We can assume that the variables are numbered so that the sum acts on the first
two. Similarly to above, we consider the consequence $\sum_{n_1=0}^{n_2} P f_{\bar n} + \sum_{n_1=0}^{n_2} e_{\bar n} = 0$
of a recurrence $Pf + e = 0$ of the sequence term $f$ where $P$ is an operator
polynomial and $e$ are excess terms. We want to find an operator polynomial $P'$
such that $\sum_{n_1=0}^{n_2} P$ becomes $P' \sum_{n_1=0}^{n_2}$ up to excess terms. Like for substitutions,
finding such a $P'$ for $P$ can be reduced to finding an operator polynomial $P^X$
satisfying $\sum_{n_1=0}^{n_2} X = P^X \sum_{n_1=0}^{n_2}$ up to excess terms for every indeterminate $X$.
The result will be a recurrence $P' \sum_{n_1=0}^{n_2} f_{\bar n} + e = 0$ of $\sum_{n_1=0}^{n_2} f_{\bar n}$.

Procedure 15. Recurrences of a sequence term $f$ are propagated to its sum
$\sum_{n_1=0}^{n_2} f_{\bar n}$ as follows. First, eliminate multipliers $M_1$ from all recurrences of $f$.
Every resulting recurrence $Pf + e = 0$ implies $\sum_{n_1=0}^{n_2} P f_{\bar n} + \sum_{n_1=0}^{n_2} e_{\bar n} = 0$. Here,
$P$ is an operator polynomial that does not contain $M_1$, and the excess terms
$e$ do not contain $f$. Next, each of these recurrences is rewritten into the form
$P' \sum_{n_1=0}^{n_2} f_{\bar n} + E_0 + E_{n_2} + \sum_{n_1=0}^{n_2} e_{\bar n} = 0$ where $P'$ is an operator polynomial and
the $E_m$'s are part of excess terms built by applying some operator polynomials
and the substitution $\{n_1 \mapsto m\}$ to $f$. This is achieved by commuting $\sum_{n_1=0}^{n_2}$
with indeterminates other than $S_1$ and $S_2$. These two indeterminates are
handled by

$$\begin{aligned}
\sum_{n_1=0}^{n_2} S_1\, g_{\bar n} &= \sum_{n_1=0}^{n_2} g_{\bar n} + \{n_1 \mapsto n_2 + 1\}\, g_{\bar n} - \{n_1 \mapsto 0\}\, g_{\bar n} \\
\sum_{n_1=0}^{n_2} S_2\, g_{\bar n} &= S_2 \sum_{n_1=0}^{n_2} g_{\bar n} - S_2\, \{n_1 \mapsto n_2\}\, g_{\bar n}
\end{aligned}$$
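
The two identities for $S_1$ and $S_2$ can be tested on concrete bivariate sequences with the Python sketch below. It assumes that $g$ is a Python function of $(n_1, n_2)$ and that the upper bound $n_2$ is at least $-1$. The function names are hypothetical.

```python
def summed(g):
    """The sequence n2 -> sum of g(n1, n2) for n1 = 0, ..., n2 (n2 >= -1)."""
    return lambda n2: sum(g(n1, n2) for n1 in range(n2 + 1))


def S1(g):
    """Shift of the summation variable n1."""
    return lambda n1, n2: g(n1 + 1, n2)


def S2(g):
    """Shift of the bound variable n2."""
    return lambda n1, n2: g(n1, n2 + 1)


def first_identity_holds(g, n2):
    # sum S1 g = sum g + {n1 -> n2 + 1} g - {n1 -> 0} g
    return summed(S1(g))(n2) == summed(g)(n2) + g(n2 + 1, n2) - g(0, n2)


def second_identity_holds(g, n2):
    # sum S2 g = S2 sum g - S2 {n1 -> n2} g
    return summed(S2(g))(n2) == summed(g)(n2 + 1) - g(n2 + 1, n2 + 1)
```

The second identity reflects that shifting $n_2$ outside the sum also raises the upper bound by one, so the extra diagonal term has to be subtracted.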


Example 16. Let us continue the proof of $\sum_{n_1=0}^{n_2} \binom{n_1}{n_2 - n_1} = F_{n_2+1}$ from Example 13.
There we found for the summand $f_{n_1,n_2} = \binom{n_1}{n_2 - n_1}$ a recurrence
$(S_1 S_2^2 - S_2 - 1) f = 0$. It is actually the only recurrence after eliminating $M_1$ as a
first step of propagation to summation. Next, we set $S_1$ to 1 using a telescoping
identity:

$$\sum_{n_1=0}^{n_2} S_1 S_2^2 f = \sum_{n_1=0}^{n_2} S_2^2 f + \{n_1 \mapsto n_2\}\, S_1 S_2^2 f - \{n_1 \mapsto 0\}\, S_2^2 f$$

Then we push the remaining shifts $S_2$ leftwards:

$$\begin{aligned}
\sum_{n_1=0}^{n_2} (S_2^2 - S_2) f &= S_2 \sum_{n_1=0}^{n_2} (S_2 - 1) f - S_2\, \{n_1 \mapsto n_2\}\, (S_2 - 1) f \\
&= (S_2^2 - S_2) \sum_{n_1=0}^{n_2} f - S_2^2\, \{n_1 \mapsto n_2\}\, f - S_2\, \{n_1 \mapsto n_2\}\, (S_2 - 1) f
\end{aligned}$$

Hence, in total we have

$$\begin{aligned}
&\sum_{n_1=0}^{n_2} (S_1 S_2^2 - S_2 - 1) f - (S_2^2 - S_2 - 1) \sum_{n_1=0}^{n_2} f \\
&\quad= \{n_1 \mapsto n_2\}\, S_1 S_2^2 f - \{n_1 \mapsto 0\}\, S_2^2 f - S_2^2\, \{n_1 \mapsto n_2\}\, f - S_2\, \{n_1 \mapsto n_2\}\, (S_2 - 1) f \\
&\quad= \binom{n_2 + 1}{1} - \binom{0}{n_2 + 2} - \binom{n_2 + 2}{0} - \binom{n_2 + 1}{1} + \binom{n_2 + 1}{0} = 0
\end{aligned}$$

Since $(S_1 S_2^2 - S_2 - 1) f = 0$, we have $(S_2^2 - S_2 - 1) \sum_{n_1=0}^{n_2} f = 0$. Now this is the
same recurrence that $F_{n_2+1}$ satisfies and hence the final propagation to difference
gives $(S_2^2 - S_2 - 1) (\sum_{n_1=0}^{n_2} f - F_{n_2+1}) = 0$. This proves an induction step of size
2 and leaves two base cases that can be discharged by a theorem prover.
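
The conclusion of this example can be confirmed numerically with the following Python sketch. It assumes the Fibonacci convention $F_1 = F_2 = 1$, under which the identity holds at $n_2 = 0$ and $n_2 = 1$, and the extension of the binomial coefficient by 0 for negative lower arguments. The names `binom`, `diagonal_sum`, and `fib` are hypothetical.

```python
from math import comb


def binom(n, k):
    """Binomial coefficient for n >= 0, extended by 0 to negative k."""
    return comb(n, k) if k >= 0 else 0


def diagonal_sum(n2):
    """Sum of binom(n1, n2 - n1) for n1 = 0, ..., n2."""
    return sum(binom(n1, n2 - n1) for n1 in range(n2 + 1))


def fib(n):
    """Fibonacci numbers with F_1 = F_2 = 1, for n >= 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a
```

Checking `diagonal_sum(0) == fib(1)` and `diagonal_sum(1) == fib(2)` corresponds to the two base cases, and the shared recurrence of size 2 extends the equality to all larger `n2`.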


Iteration on Excess Terms. Let $g$ be the term from the negated goal to be
proved to be 0. After propagating along the structure of $g$, we end up with
recurrences of the form $Pg = e$ where $P$ is an operator polynomial and the
excess terms $e$ do not contain $g$. In the holonomic case, $e$ will be syntactically
0. We have also observed that $e$ is often 0 in the nonholonomic case as well.
But if $e$ is not syntactically 0, then $Pg = e$ cannot immediately be used for a
proof by induction. A solution is to iterate a full series of propagations with $e$ in
place of $g$ to find $P_2 e = e_2$ and conclude $P_2 P g = P_2 e = e_2$, then repeat as long
as necessary. This process will always terminate, although it might fail to find
recurrences.

We will impose an order on the sequence terms to accomplish three things.
First, we get a proper definition of which terms in a recurrence are excess.
Second, well-foundedness of the order will guarantee termination of the iteration
of full propagations to excess terms. Third, the iterations can be interleaved
with basic normalizations such as $\{n_1 \mapsto 2n_1\}\, M_1\, \{n_1 \mapsto 3n_1 + 1\}\, f \to
2 M_1\, \{n_1 \mapsto 6n_1 + 2\}\, f$.


Definition 17. The spine of a sequence term f without addition, denoted by 
spine f, is the sequence term obtained intuitively by erasing operator polynomials 
from f. Precisely, this means fully reducing f by the rewrite rules at — t, 
M;t > t, {i bri + at > {r = dri}t, and ye = Di where a, €, and 
the matrix b are all numeric. 


Shift indeterminates mix with other substitutions, which explains the last
two rules. For example, spine $\{n_1 \mapsto 2n_1\}\, M_1\, \{n_1 \mapsto 3n_1 + 1\}\, f = \{n_1 \mapsto 2n_1\}\,
\{n_1 \mapsto 3n_1\}\, f$. If we have a sequence term $c_1 g^1 + \cdots + c_k g^k$ with addition, it
contains multiple spines, one for each $g^j$. The significance of spines is that when
we derive a more complex consequence from a recurrence (during elimination by
applying an operator polynomial to it), its spines do not become more complex.

We can easily describe how each propagation step changes the spines of the
involved sequence terms. Propagation to the addition $f + g$ produces only spines
$e_f$ and $e_g$ in the resulting recurrences, where $e_f$ denotes a spine of a term from
a recurrence of $f$ and analogously for $e_g$. Moreover, propagation to the substitution
$\sigma f$ produces $\sigma e_f$, propagation to the product $fg$ produces $(\text{spine } f)\, e_g$
and $e_f\, (\text{spine } g)$, and propagation to the summation $\sum_{n_1=0}^{n_2} f$ produces $\{n_1 \mapsto 0\}\,
(\text{spine } f)$, $\{n_1 \mapsto n_2\}\, (\text{spine } f)$, and $\sum_{n_1=0}^{n_2} e_f$, where $e_f$ and $e_g$ are as above.

We want propagations to preserve the invariant that excess terms are small. 
Given how spines change under propagation, a term order on spines offers a way 
to define smallness. We choose an order that also orients simplifications. 


Definition 18. Fix a Knuth—Bendix order with argument coefficients [12] with 
exactly three weights Wro > W (-) > 3Wo > 0 and all argument coefficients 
set to 2. Moreover, projection sequence terms corresponding to M,’s (Remark 4) 
must have equal weights, and substitutions with fewer bindings must have 
lower precedence. The excess (partial) order on addition-free sequence terms is 
obtained by comparing the spines of terms using this fixed order. Excess terms 
of a recurrence are all its nonmaximal sequence terms w.r.t. the excess order. 


The weights for the excess order are arranged to be compatible with nor- 
malization, which pushes substitutions to the leaf nodes of the term tree and 
pulls summations towards the root. The resulting normal form is simply the 
typical way of writing terms without explicit substitutions. It is also the nor- 
mal form of the rewrite system consisting of the applicable associativity and/or 
commutativity rules of - as well as the following rules: 


a: jt jst 1t,t1,{}t—t 
(xis) ‘to jst (aU {nj a}) u > ou 
ao) t>}; ot (oU{n; =a} M; >a 
Ept > {nj 0}t+ + {nj c}t a (ts) > ot- os 
bjt {n; => —-l}t—----—{nj ro 1-c}t aa't—(coo')t 


where $s$, $t$, $u$ are sequence terms, $u$ does not contain the variable $n_j$, $\mathfrak a$ is an
affine variable sum, $M_j = \bar n \mapsto n_j$ is a projection sequence term, $\sigma$, $\sigma'$ are affine
substitutions, and the numeral $c$ is nonnegative.

These rules produce additions, which must be interpreted as follows. For any
rule above of the general form $t_0 \to c_1 t_1 + \cdots + c_k t_k$, the actual rewrite on the
level of entire recurrences is $f[t_0] + R = 0 \to c_1 f[t_1] + \cdots + c_k f[t_k] + R = 0$ where
$c_j$ are numerals, the sequence terms $f[t_j]$ are equal except for the distinguished
subterm $t_j$, and $R$ is the sum of the remaining terms in the recurrence.

To conclude termination, it suffices to prove that $t_0$ dominates each of $t_1, \ldots,
t_k$ individually. The proof is in our technical report. It makes apparent our choices
of weights and argument coefficients for the transfinite Knuth-Bendix order.


5 Induction


After propagation, we consider all recurrences $Pg = 0$ of the goal sequence
term $g$ to be proved to be 0. In exceptionally fortunate cases, the operator
polynomial $P$ is $\pm 1$ and we are unconditionally done because, for any group, the
multiplication-by-$\pm 1$ map is invertible. This happens when the objective is to
prove a recurrence that this method derives as a substep anyway. Otherwise, we
apply induction and leave as conditions the base cases as well as invertibility of
the multiplication maps associated with the leading monomials' coefficients.

A common case is that variables range over natural numbers and we have a
final recurrence with leading shift $S_1^{b_1} \cdots S_d^{b_d}$ w.r.t. any monomial order. Then the
values $\bigcup_{j=1}^{d} \{\bar n \in \mathbb{N}^d \mid n_j < b_j\}$ are left for the base cases, as a union of
hyperplane stacks that is infinite unless $d \le 1$, but it corresponds to only $\sum_{j=1}^{d} b_j$
one-variable substitutions $\{n_j \mapsto a\}$ for $1 \le j \le d$ and $0 \le a < b_j$. If our
generalization produced variables that do not participate in the induction (i.e.,
their $b_j$'s are 0), they are replaced back to their original values.

If there is more than one applicable final recurrence, we take the intersection 
of their base value sets w.r.t. the same monomial order. To see that it works, 
consider any point outside the intersection. It is a nonbase point w.r.t. some final 
recurrence and hence the induction step can be taken by the recurrence. 

To represent the intersection as substitutions, we distribute it over the
hyperplane stack unions. This results in a union of hyperline stacks of the form

$N(J, b) := \{n \in \mathbb{N}^d \mid n_j < b_j \text{ for all } j \in J\}$ where $J \subseteq \{1, \ldots, d\}$ and $b$ vary.
One such stack is represented by $\prod_{j \in J} b_j$ substitutions $\{n_j \mapsto a_j \mid j \in J\}$ where
the $a_j$'s are chosen arbitrarily such that $0 \le a_j < b_j$. Unfortunately, distribution
duplicates some base cases. To compensate, if $I \subseteq J$ and $b \ge c$ pointwise, then
$N(I, b) \supseteq N(J, c)$, so that $N(J, c)$ can be removed in favor of $N(I, b)$.
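
The distribution and pruning just described can be illustrated with a small Python sketch. It is not taken from the procedure's implementation: the dictionary encoding of a stack (variable index to bound, with variables numbered from 0), the function names, and the example exponents are all assumptions made for illustration.

```python
from itertools import product

def base_stacks(bounds):
    """Base set of one final recurrence with leading exponents `bounds`:
    one single-variable stack {j: b_j} per variable with b_j > 0."""
    return [{j: b} for j, b in enumerate(bounds) if b > 0]

def intersect(stacks1, stacks2):
    """Distribute the intersection over both unions. Intersecting two
    stacks merges their index sets and takes the smaller bound."""
    result = []
    for s1, s2 in product(stacks1, stacks2):
        merged = dict(s1)
        for j, b in s2.items():
            merged[j] = min(merged.get(j, b), b)
        result.append(merged)
    return result

def subsumes(big, small):
    """big = N(I, b) contains small = N(J, c) if I is a subset of J
    and b >= c on I."""
    return all(j in small and big[j] >= small[j] for j in big)

def prune(stacks):
    """Drop every stack that is contained in another kept stack."""
    kept = []
    for s in stacks:
        if any(subsumes(k, s) for k in kept):
            continue
        kept = [k for k in kept if not subsumes(s, k)] + [s]
    return kept

def in_union(point, stacks):
    """Membership of a point in the union of the given stacks."""
    return any(all(point[j] < b for j, b in s.items()) for s in stacks)

def substitutions(stack):
    """The substitutions {n_j -> a_j} with 0 <= a_j < b_j for one stack."""
    idx = sorted(stack)
    ranges = (range(stack[j]) for j in idx)
    return [dict(zip(idx, vals)) for vals in product(*ranges)]

# Two hypothetical final recurrences in d = 2 variables.
r1, r2 = base_stacks([2, 1]), base_stacks([1, 3])
stacks = prune(intersect(r1, r2))
print(stacks)
```

In this example the distributed intersection has four stacks, one of which is contained in another and is pruned.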

If a variable $n \in \mathbb{Z}$ is unbounded, we perform two inductions on the rays
$0 \le n$ and $n < b$ if $b$ base cases are needed. The backward induction on $n < b$ can
be transformed into an induction on $\mathbb{N}$ by the change of variables $n \mapsto b - 1 - n$.


6 Examples 


Our procedure can prove the induction step of holonomic sequence formulas such 
as Example 13, the binomial formula: 


$(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$        ("3") = ee (1) (nn)


Heterogeneous recurrences, which are beyond the holonomic fragment, enable 
proving elementary general sequence formulas such as Example 11 and the fol- 
lowing: 


I I k h k k 
pa = pane n=02on=0fh;n = DD ae 


If we ignore the holonomic base case requirements, we can for example prove
the induction steps of Abel's binomial formula and of some Stirling number
identities:


$(a+b)^h = \sum_{n=0}^{h} \binom{h}{n} a (a-n)^{n-1} (b+n)^{h-n}$        $h^k/h! = \sum_{n=0}^{h} {k \brace n}/(h-n)!$


Here, the Stirling numbers of the second kind ${k \brace n}$ are one of many special
non-holonomic sequences that frequently arise in combinatorics. They count the
number of partitions of a $k$-element set into $n$ nonempty subsets.

As further demonstration, we apply our procedure to the last equation. For 
convenience, we will use the name of a variable also to denote its multiplier 
operator. Moreover, we will use the uppercase version of the name of a variable 
to denote its shift operator. The defining recurrence of the Stirling numbers
then reads $(KN - (n+1)N - 1){k \brace n} = 0$ for $k, n \ge 0$, where $K$ and $N$ denote
the shift operators for the variables $k$ and $n$, the first $n$ denotes the multiplier
for the variable $n$, and the second $n$ is the variable itself. This recurrence is
complemented by the initial values ${0 \brace 0} = 1$ and ${n \brace 0} = {0 \brace n} = 0$ if $n \ne 0$.
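
As a sanity check of the identity being proved (not a part of the procedure itself), the following Python sketch computes the Stirling numbers from the defining recurrence and the initial values, and compares both sides of $h^k/h! = \sum_{n=0}^{h} {k \brace n}/(h-n)!$ with exact rational arithmetic. The function names are chosen here for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling2(k, n):
    """Stirling number of the second kind, from the recurrence
    {k+1, n+1} = (n+1) {k, n+1} + {k, n} and the initial values."""
    if k == 0 and n == 0:
        return 1
    if k == 0 or n == 0:
        return 0
    return n * stirling2(k - 1, n) + stirling2(k - 1, n - 1)

def power_side(h, k):
    """h^k / h! as an exact fraction."""
    return Fraction(h ** k, factorial(h))

def sum_side(h, k):
    """Sum over n = 0..h of {k, n} / (h - n)!."""
    return sum(Fraction(stirling2(k, n), factorial(h - n))
               for n in range(h + 1))

# Compare both sides on a small grid of values.
print(all(power_side(h, k) == sum_side(h, k)
          for h in range(7) for k in range(7)))
```

Such a numeric check only samples finitely many points; the induction step derived below is what covers all $h$ and $k$.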

Starting from the right, the inverse $m!^{-1}$ of the factorial satisfies the
recurrence $(mM + M - 1)\, m!^{-1} = 0$ that holds for all $m \in \mathbb{Z}$ by extension. This must
be found in the initialization step because there is no propagation to division.
Propagation to the substitution $\{m \mapsto h - n\}$ then gives the following
recurrences, factored for clarity:

$((h - n + 1)H - 1)(h - n)!^{-1} = 0$        $(N - h + n)(h - n)!^{-1} = 0$

To propagate to product, we consider ${k_1 \brace n_1}$ and $(h_2 - n_2)!^{-1}$ with variables
renamed apart. We must propagate to the substitution $\{n_j \mapsto n,\ h_j \mapsto h,\ k_j \mapsto k \mid j \in \{1, 2\}\}$
the recurrences of ${k_1 \brace n_1}(h_2 - n_2)!^{-1}$ given by the following
five operator polynomials:


$K_1 N_1 - (n_1 + 1) N_1 - 1$ and $H_1 - 1$ from ${k_1 \brace n_1}$
$(h_2 - n_2 + 1) H_2 - 1$, $N_2 - h_2 + n_2$, and $K_2 - 1$ from $(h_2 - n_2)!^{-1}$


We added here the trivial recurrences given by $H_1 - 1$ and $K_2 - 1$ implied by
the independence from $h_1$ and $k_2$. Among the defining recurrences (4) of the
compound shift indeterminates $N$, $H$, $K$, the recurrence $H_1 H_2 - H$ simplifies to
$H_2 - H$ by $H_1 - 1$ and $K_1 K_2 - K$ to $K_1 - K$ by $K_2 - 1$. (In other words, the
factorwise renaming of already disjoint variables $h$ and $k$ amounts to renaming
in the entire product.) The third compound shift recurrence, $N_1 N_2 - N$,
simplifies to $(h_2 - n_2) N_1 - N$ by $N_2 - h_2 + n_2$. The part of the Gröbner
basis with only compound shifts is then straightforwardly finished with the
result $\{KN - (n_1 + 1)N - h_2 + n_2,\ (h_2 - n_2 + 1)H - 1\}$. Hence this propagation
step yields


$(KN - (n+1)N - h + n)\left({k \brace n}/(h-n)!\right) = 0$        $((h - n + 1)H - 1)\left({k \brace n}/(h-n)!\right) = 0$
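
The two propagated recurrences can be spot-checked numerically. The Python sketch below is only an illustration under stated assumptions: the helper names are made up, and $1/m!$ is taken to be 0 for negative $m$, which is the extension to $\mathbb{Z}$ that the recurrence $(mM + M - 1)\, m!^{-1} = 0$ forces.

```python
from fractions import Fraction
from math import factorial

def stirling2(k, n):
    """Stirling number of the second kind; 0 outside the natural range."""
    if k == 0 and n == 0:
        return 1
    if k <= 0 or n <= 0:
        return 0
    return n * stirling2(k - 1, n) + stirling2(k - 1, n - 1)

def inv_factorial(m):
    """1/m!, extended by 0 to negative m."""
    return Fraction(1, factorial(m)) if m >= 0 else Fraction(0)

def summand(k, h, n):
    """The summand {k, n} / (h - n)!."""
    return stirling2(k, n) * inv_factorial(h - n)

def first_recurrence(k, h, n):
    """(KN - (n+1) N - h + n) applied to the summand at (k, h, n)."""
    return (summand(k + 1, h, n + 1)
            - (n + 1) * summand(k, h, n + 1)
            - (h - n) * summand(k, h, n))

def second_recurrence(k, h, n):
    """((h - n + 1) H - 1) applied to the summand at (k, h, n)."""
    return (h - n + 1) * summand(k, h + 1, n) - summand(k, h, n)

# Both operators should annihilate the summand at every sampled point.
points = [(k, h, n) for k in range(5) for h in range(5) for n in range(5)]
print(all(first_recurrence(*p) == 0 and second_recurrence(*p) == 0
          for p in points))
```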


To sum over $n$, we first eliminate $n$ from the previous two recurrences and
conclude $(H(K - h) + (N - 1)(KH - (h+1)H + 1))\left({k \brace n}/(h-n)!\right) = 0$. The
sum has natural boundaries, meaning that the summand vanishes outside them.
This guarantees that there will be no excess terms, which we also tediously
discover when pulling out the indeterminates:


P 
z 7 fi fi a _ { fat {E} 
Pie (h+1)H4 2 er S iy aa 


= — {FEM /(h+1)!+ {5} (R41) H-1) =0 
n=0 (h= n)! n=o(h =n)! (=1)! 
Here, by the recurrence of the inverse of the factorial, we get $(-1)!^{-1} = 0$. So we
obtain a recurrence $H(K - h) \sum_{n=0}^{h} {k \brace n}/(h-n)! = 0$ for the left-hand side of
our goal. For the right-hand side, we unproblematically obtain $(K - h)(h^k/h!) = 0$.
Hence $H(K - h)$ zeros out the difference $h^k/h! - \sum_{n=0}^{h} {k \brace n}/(h-n)!$. The
largest shift $HK$ of the operator $H(K - h)$ determines that the two sets of base
cases $h = 0$ and $k = 0$ are sufficient for induction.


7 Related Work 


Holonomic sequences [20] are closely related to our work. Unlike our approach, 
which allows infinitely many base cases as long as they are finitely representable 
(Sect. 5), they are limited to a finite number of base cases. Relaxing this limita- 
tion yields approximately the homogeneous version of our propagation procedure 
(i.e., without excess terms), whose theory Chyzak, Kauers, and Salvy laid out 
[6]. Heterogeneity amounts to module Gröbner bases [5,8,13]. Its integration into 
propagations makes elementary identities about general sequences automatically 
provable, which may be of interest for general-purpose theorem provers. 

In practice, hypergeometric sums are common holonomic sequences that 
have much faster algorithms available. Gosper’s indefinite summation [9] can 
be applied to compute Wilf-Zeilberger pairs [19], which offer compact proof cer- 
tificates for definite sum identities. These fast methods admit generalizations to 
the full holonomic setting. See Koutschan’s thesis [11] for an overview. 

Finding a closed form instead of only checking it for a summation is a different 
but related task. A common approach is to perform a recurrence solving phase 
after recurrence computation, as in the Mathematica package Sigma [1,15]. 


8 Conclusion 


We presented a procedure for proving equations involving summations within an 
automatic higher-order theorem prover. The procedure is inspired by holonomic 
sequences and partly generalizes them. It expresses the problem as recurrences 
and derives new recurrences from existing ones. In case of success, it shows the 
induction step of a proof by induction, leaving the base cases to the prover. 

As future work, we want to continue implementing the procedure in Zip- 
perposition [17]. We hope that the subsequent practical experiments help us to 
settle how side conditions of initial recurrences ought to be handled. 
Abstract. We prove the correctness of invertibility conditions for the
theory of fixed-width bit-vectors, which are used to solve quantified bit-vector
formulas in the Satisfiability Modulo Theories (SMT) solver cvc5, in the
Coq proof assistant. Previous work proved many of these in a com- 
pletely automatic fashion for arbitrary bit-width; however, some were 
only proved for bit-widths up to 65, even though they are being used to 
solve formulas over larger bit-widths. In this paper we describe the pro- 
cess of proving a representative subset of these invertibility conditions 
in Coq. In particular, we describe the BVList library for bit-vectors in 
Coq, our extensions to it, and proofs of the invertibility conditions. 


1 Introduction 


Many applications in hardware and software verification rely on bit-precise rea- 
soning, which can be modeled using the SMT-LIB 2 theory of fixed-width bit- 
vectors [3]. While Satisfiability Modulo Theories (SMT) solvers are able to reason 
about bit-vectors of fixed width, they currently require all widths to be expressed 
concretely (by a numeral) in their input formulas. For this reason, they cannot 
be used to prove properties of bit-vector operators that are parametric in the 
bit-width, such as the associativity of bit-vector concatenation. Proof assistants 
such as Coq [25], which have direct support for dependent types, are better 
suited for such tasks. 

Bit-vector formulas that are parametric in the bit-width arise in the verifica- 
tion of parametric Boolean functions and circuits (see, e.g., [13]). In our case, we 
are mainly interested in parametric lemmas that are relevant to internal tech- 
niques of SMT solvers for the theory of fixed-width bit-vectors. These include, for 
example, rewrite rules, refinement schemes, and preprocessing passes. Such tech- 
niques are developed a priori for every possible bit-width. Meta-reasoning about 
the correctness of such solvers then requires bit-width independent reasoning. 

In this paper, we focus on parametric lemmas that originate from a quantifier- 
instantiation technique implemented in the SMT solver cvc5 [2]. This technique 
is based on invertibility conditions [15]. For a trivial case of an invertibility
condition, consider the equation $x + s = t$, where $x$, $s$, and $t$ are variables of the
same bit-vector sort. In the terminology of Niemetz et al. [15], this equation is
“invertible for $x$.” A general inverse, or “solution,” is given by the term $t - s$. Since
there is always such an inverse, the invertibility condition for $x + s = t$ is simply
the universally true formula $\top$. The formula stating this fact, referred to here as
an invertibility equivalence, is $\top \Leftrightarrow \exists x.\ x + s = t$, which is valid in the theory of
fixed-width bit-vectors, for any bit-width. In contrast, the equation $x \cdot s = t$ is
not always invertible for $x$. A necessary and sufficient condition for invertibility
in this case was found in [15] to be $(-s \mid s) \mathbin{\&} t = t$. So, the invertibility
equivalence $(-s \mid s) \mathbin{\&} t = t \Leftrightarrow \exists x.\ x \cdot s = t$ is valid for any bit-width. Notice
that the invertibility condition does not contain $x$. Hence, invertibility conditions
can be seen as a technique for quantifier elimination.
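
For small concrete bit-widths, both equivalences can be checked exhaustively. The following Python sketch is an illustration only and is unrelated to the Coq development: it models a bit-vector of width n as an integer modulo $2^n$, and the function names are chosen here for the example.

```python
def add_inverse(s, t, n):
    """The inverse t - s of x + s = t, computed modulo 2^n."""
    return (t - s) & ((1 << n) - 1)

def mul_condition(s, t, n):
    """The invertibility condition (-s | s) & t = t at width n."""
    mask = (1 << n) - 1
    return ((((-s) & mask) | s) & t) == t

def mul_invertible(s, t, n):
    """Brute-force search for an x with x * s = t modulo 2^n."""
    mask = (1 << n) - 1
    return any(((x * s) & mask) == t for x in range(1 << n))

def check(n):
    """Check both invertibility equivalences for all s, t of width n."""
    mask = (1 << n) - 1
    for s in range(1 << n):
        for t in range(1 << n):
            if ((add_inverse(s, t, n) + s) & mask) != t:
                return False
            if mul_condition(s, t, n) != mul_invertible(s, t, n):
                return False
    return True

print(all(check(n) for n in range(1, 6)))
```

A check of this kind covers only the widths it enumerates, which is precisely the limitation that motivates bit-width independent proofs.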

In [15], a total of 160 invertibility conditions were provided. However, they 
were verified only for bit-widths up to 65, due to the reasoning limitations of 
SMT solvers mentioned earlier. Recent work [16,17] addresses this challenge by 
translating the invertibility equivalences to the combined theory of non-linear 
integer arithmetic and uninterpreted functions. This approach was partially suc- 
cessful, but failed to verify over a quarter of the equivalences. 

We verify invertibility equivalences proposed in [15] by proving them interac- 
tively in Coq. From a representative subset of the invertibility equivalences, we 
prove 19 equivalences, 12 of which were not proven in [16,17]. For the remain- 
ing 7, that were already proved there, our Coq proofs provide more confidence. 
Our results offer evidence that proof assistants can support automated theorem 
provers in meta-verification tasks. To facilitate the verification of invertibility 
equivalences, we use a rich Coq library for bit-vectors, which is a part of the 
SMTCoq project [10]. This Coq library models the theory of fixed-width
bit-vectors adopted by the SMT-LIB 2 standard [3]. For this work, we extended the
library with the arithmetic right-shift operation and the unsigned weak less-than 


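Table 1. The signatures $\Sigma_1$ and $\Sigma_0$ with SMT-LIB 2 syntax. $\Sigma_1$ consists of the
operators in the entire table. $\Sigma_0$ consists of the operators in the upper part.


Symbol                                   SMT-LIB Syntax                        Sort
----------------------------------------------------------------------------------------------------------------------------
$=$, $\neq$                              =, distinct                           $\sigma_{[n]} \times \sigma_{[n]} \to \mathrm{Bool}$
$<_u$, $>_u$, $\le_u$, $\ge_u$           bvult, bvugt, bvule, bvuge            $\sigma_{[n]} \times \sigma_{[n]} \to \mathrm{Bool}$
${\sim}$, $-$                            bvnot, bvneg                          $\sigma_{[n]} \to \sigma_{[n]}$
$\&$, $\mid$, $\ll$, $\gg$, $\gg_a$      bvand, bvor, bvshl, bvlshr, bvashr    $\sigma_{[n]} \times \sigma_{[n]} \to \sigma_{[n]}$
$+$                                      bvadd                                 $\sigma_{[n]} \times \sigma_{[n]} \to \sigma_{[n]}$
----------------------------------------------------------------------------------------------------------------------------
$<_s$, $>_s$, $\le_s$, $\ge_s$           bvslt, bvsgt, bvsle, bvsge            $\sigma_{[n]} \times \sigma_{[n]} \to \mathrm{Bool}$
$\cdot$, mod, $\div$                     bvmul, bvurem, bvudiv                 $\sigma_{[n]} \times \sigma_{[n]} \to \sigma_{[n]}$
$\circ$                                  concat                                $\sigma_{[n]} \times \sigma_{[m]} \to \sigma_{[n+m]}$
$[u : l]$                                extract                               $\sigma_{[n]} \to \sigma_{[u-l+1]}$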
and greater-than predicates. To summarize, the contributions of this paper are as
follows: (i) a description of the SMTCoq bit-vector library; (ii) extensions to the
signature and proofs of the library; and (iii) formal proofs in Coq of invertibil- 
ity equivalences. These contributions, while important in their own right, have 
the potential to go beyond the verification of invertibility equivalences. For (i) 
and (ii), we envision that the library, as well as its extension, will be useful for 
the formalization of other bit-precise reasoning mechanisms, especially related to 
SMT, such as rewriting rules, lemma schemas, interactive verification, and more. 
For (iii), invertibility conditions are primarily used for quantifier instantiation 
(see, e.g., [15]). We hope that the increased confidence in their correctness will 
encourage their usage in other contexts and in more solvers. Further, the for- 
mal proofs can serve as guiding examples for other proofs related to bit-precise 
reasoning. 

The remainder of this paper is organized as follows. After technical preliminaries
in Sect. 2, we formalize invertibility conditions in Sect. 3 and discuss
previous attempts at verifying them. In Sect. 4, we describe the Coq library and 
our extensions to it. In Sect. 5, we discuss our Coq proofs. We conclude in Sect. 6 
with directions for future work. A preliminary version of this work was presented 
as an extended abstract in the proceedings of the PxTP 2019 workshop [11]. The 
current version is more detailed and complete. In particular, the one Coq proof 
that was missing in [11] is now completed. 


2 Preliminaries 


2.1 Theory of Bit-Vectors 


We assume the usual terminology of many-sorted first-order logic with equality
(see, e.g., [12]). We denote equality by $=$, and use $x \neq y$ as an abbreviation
for $\neg(x = y)$. The signature $\Sigma_{BV}$ of the SMT-LIB 2 theory of fixed-width
bit-vectors defines a unique sort for each positive integer $n$, which we denote by $\sigma_{[n]}$.
For every positive integer $n$ and bit-vector of width $n$, the signature contains a
constant symbol of sort $\sigma_{[n]}$, representing that bit-vector, which we denote as
a binary string of length $n$. The function and predicate symbols of $\Sigma_{BV}$ are as
described in the SMT-LIB 2 standard. Formulas of $\Sigma_{BV}$ are built from variables,
bit-vector constants, and the function and predicate symbols of $\Sigma_{BV}$, along with
the usual logical connectives and quantifiers. We write $\psi[x_1, \ldots, x_n]$ to represent
a formula whose free variables are from the set $\{x_1, \ldots, x_n\}$.

The semantics of $\Sigma_{BV}$-formulas is given by interpretations where the domain
of $\sigma_{[n]}$ is the set of bit-vectors of width $n$, and the function and predicate symbols
are interpreted as specified by the SMT-LIB 2 standard. A $\Sigma_{BV}$-formula is
valid in the theory of fixed-width bit-vectors if it is satisfied by every such
interpretation.

Table 1 contains the operators from $\Sigma_{BV}$ for which invertibility conditions
were defined in [15]. We define $\Sigma_1$ to be the signature that contains only these
symbols. $\Sigma_0$ is the sub-signature obtained by only taking the operators from the
upper part of the table. We use the (overloaded) constant 0 to represent the
bit-vectors composed of all 0-bits.


2.2 Coq 


The Coq proof assistant is based on the calculus of inductive constructions 
(CIC) [20]. It implements properties as types, and proofs as terms, reducing 
proof-checking to type-checking. Coq has a rich type system, that allows for 
highly expressive propositions to be stated and proved in this manner. One par- 
ticular feature of interest is that of dependent types — types that can depend 
on values — through which one can express correctness properties within types. 
We refer to non-dependent types as simple types. 

The Coq module system — in addition to allowing for principled separations 
of large developments — allows the abstraction of complex types along with 
operations over them as modules. A module signature or module type acts as 
an interface to a module, specifying the type it encapsulates along with the 
signatures of the associated operators. A functor is a module-to-module function. 


3 Invertibility Conditions and Their Verification 


In [15], a technique to solve quantified bit-vector formulas is presented, which is 
based on invertibility conditions. 


Definition 1. An invertibility condition for a variable $x$ in a $\Sigma_{BV}$-literal
$\ell[x, s, t]$ is a formula $IC[s, t]$ such that $\forall s.\, \forall t.\ IC[s, t] \Leftrightarrow \exists x.\ \ell[x, s, t]$ is valid
in the theory of fixed-width bit-vectors.


Example 1. The invertibility condition for $x$ in $x \mathbin{\&} s = t$ is $t \mathbin{\&} s = t$.


In [15], invertibility conditions are defined for a representative set of literals
$\ell$ over the bit-vector operators of $\Sigma_1$, having a single occurrence of $x$. The
soundness of the technique proposed in that work relies on the correctness of the
invertibility conditions. Every literal $\ell[x, s, t]$ and its corresponding invertibility
condition $IC[s, t]$ induce an invertibility equivalence.


Definition 2. The invertibility equivalence associated with the literal $\ell[x, s, t]$
and its invertibility condition $IC[s, t]$ is the formula


$IC[s, t] \Leftrightarrow \exists x.\ \ell[x, s, t]$        (1)


The correctness of invertibility equivalences should be verified for all possible 
sorts for the variables x, s,t for which the condition is well sorted. Concretely, 
one needs to prove the validity of the following formula: 


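$\forall n : \mathbb{N}.\ n > 0 \Rightarrow \forall s : \sigma_{[n]}.\, \forall t : \sigma_{[n]}.\ IC[s, t] \Leftrightarrow \exists x : \sigma_{[n]}.\ \ell[x, s, t]$        (2)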


This was done in [15], but only for concrete values of n from 1 to 65, using 
solvers for the theory of fixed-width bit-vectors. In contrast, Eq. (2) cannot even 
be expressed in this theory. To overcome this limitation, later work suggested
a translation from bit-vector formulas over parametric bit-widths to the theory 
of non-linear integer arithmetic with uninterpreted functions [16,17]. Thanks to 
this translation, the authors were able to verify the correctness of 110 out of 
160 invertibility equivalences. For the remaining 50 equivalences, it then seems 
appropriate to use a proof assistant, as this allows for more intervention by the
user who can provide crucial intermediate steps. Even for the 110 invertibility 
equivalences that were proved, the level of confidence achieved by proving them 
in a proof assistant would be greater than an automatic verification by an SMT 
solver due to the smaller trusted code-base of proof assistants in relation to those 
of automatic theorem provers such as SMT solvers. 




Fig. 1. The level of confidence achieved by the different approaches. 


Figure 1 depicts the level of confidence achieved by the various approaches 
to verify invertibility equivalences. The smallest circle, labelled auto-65, repre- 
sents the approach taken by [15], where invertibility equivalences were verified 
automatically up to 65 bits. While a step in the right direction, this approach 
is insufficient, because invertibility conditions are used for arbitrary bit-widths. 
The next circle, labeled auto-ind, depicts the approach of [17], which addresses 
the restrictions of auto-65 by providing bit-width independent proofs of the 
invertibility equivalences. However, both auto-65 and auto-ind provide proofs 
by SMT solvers, which are less trusted than interactive theorem provers (ITPs). The
largest circle (Coq) corresponds to work presented in the current paper which, while addressing the
limitations of auto-65 via bit-width independent proofs, also provides stronger 
verification guarantees by proving the equivalences in an interactive theorem 
prover. Moreover, with this approach, we were able to prove equivalences that 
couldn’t be fully verified (for arbitrary bit-widths) by either auto-65 or auto-ind. 


4 The BVList Library 


In this section, we describe the Coq library we use and the extensions we developed
with the goal of formalizing and proving invertibility equivalences. Various
formalizations of bit-vectors in Coq exist. The internal Coq library of
bit-vectors [9] is one, but it has only definitions and no lemmas. The Bedrock Bit
Vectors Library [6] treats bit-vectors as words (machine integers). The SSRBit 
Library [5] represents bit-vectors as finite bit-sets in Coq and extracts them 
to OCaml machine integers. Our library is more suited to the SMT-LIB 2 bit- 
vectors, and includes operators that are not fully covered by any of the previ- 
ously mentioned libraries. More recently, Shi et al. [22] developed a library called 
CoqQFBV that presents a bit-vector type as a sequence of Booleans, defines 
operators over it, and proves the correctness of these operations with respect 
to a (machine integer) semantics. Shi et al. [22] use this library to define a bit-blasting
algorithm in Coq that is extracted into an OCaml program to perform certified
bit-blasting. Since CoqQFBV covers the entire SMT-LIB 2 bit-vector signature, 
it would be a good alternative to ours in formalizing and proving invertibility 
conditions. Our library offers a rich set of lemmas over bit-vector operations that 
makes it suitable for proofs of invertibility conditions and other bit-vector prop- 
erties. Bit-vectors have also been formalized in other proof assistants. Within 
the Isabelle/HOL framework, one can utilize the library developed by Beeren et 
al. [4] to align with SMT-LIB 2 bit-vector operations. Furthermore, Harrison [1] 
presents a formalization of finite-dimensional Euclidean space within HOL light, 
accompanied by an implementation of vectors. 


4.1 BVList Without Extensions 


BVList was developed for SMTCoq [10], a Coq plugin that enables Coq to 
dispatch proofs to external proof-producing solvers. While the library was only 
briefly mentioned in [10], here we provide more details. 

The library adopts the little-endian notation for bit-vectors, following the 
internal representation of bit-vectors in SMT solvers such as cvc5, and corre- 
sponding to lists in Coq. This makes arithmetic operations easier to perform 
since the least significant bit of a bit-vector is the head of the Boolean list that 
represents it. 

Another choice is how to formalize the bit-vector type. A dependently-typed 
definition is natural, since then the type of a bit-vector is parameterized by its 
length. However, such a representation leads to some difficulties in proofs. Depen- 
dent pattern-matching or case-analysis with dependent types is cumbersome and 
unduly complex (see, e.g., [23]), because of the complications brought by unifica- 
tion in Coq (which is inherently undecidable [24]). A simply-typed definition, on 
the other hand, does not provide such obstacles for proofs, but is less natural, as 
the length becomes external to the type. The BVList library defines for conve- 
nience both the dependently and the simply typed version of bit-vectors. It uses 
the Coq module system to separate them, and a functor that connects them, 
avoiding redundancy. The relationship between the two definitions is depicted 
in Fig. 2. 

In BVList, a dependently-typed bit-vector is a record parameterized by its 
size n and consisting of two fields: a Boolean list and a condition to ensure that 
the list has length n. This type, and the corresponding lemmas and properties 
over it, are encapsulated by the BITVECTOR_LIST module of type BITVECTOR. A 
simply-typed or raw bit-vector representation is simply a Boolean list which,
along with its associated operators and lemmas is specified by module signature 
RAWBITVECTOR and implemented in module RAWBITVECTOR_LIST. In other words, 
the interface of BVList offers dependently-typed bit-vectors, while the underly- 
ing operators are defined and proofs are performed using raw bit-vectors. 


BITVECTOR_LIST : BITVECTOR
        ^
        |  RAW2BITVECTOR
        |
RAWBITVECTOR_LIST : RAWBITVECTOR


Fig. 2. Modular separation of BVList 


A functor called RAW2BITVECTOR derives corresponding definitions and proofs 
over dependently-typed bit-vectors within the module for dependent-types, when 
it is applied to RAWBITVECTOR_LIST. The functor establishes a correspondence 
between the two theories so that one can first prove a bit-vector property in 
the context of the simply-typed theory and then map it to its corresponding 
dependently-typed one via the functor module. Otherwise put, users of the 
library can encode theorem statements more naturally, and in a more expressive
environment employing dependent types. For proofs, one can unlift them
(by the functor) to the equivalent encodings with simple types, and prove them 
there. 


4.2 Extending BVList 


Out of the 13 bit-vector functions and 10 predicates contained in $\Sigma_1$, BVList
had direct support for 10 functions and 6 predicates. The predicate symbols
that were not directly supported were the weak inequalities $\le_u$, $\ge_u$, $\le_s$, $\ge_s$, and
the unsupported function symbols were $\gg_a$, $\div$, and mod. We extended BVList
with the operator $\gg_a$ and the predicates $\le_u$ and $\ge_u$ in order to support the
corresponding invertibility conditions. Additionally, we redefined $\ll$ and $\gg$ in
order to simplify the proofs of invertibility conditions over them.¹

We focused on invertibility conditions for literals of the form $x \diamond s \bowtie t$ and
$s \diamond x \bowtie t$, where $\diamond$ and $\bowtie$ are respectively function and predicate symbols in $\Sigma_0$.
$\Sigma_0$ was chosen as a representative set because it is both expressive enough (in
the sense that other operators can be easily translated to this fragment), and


1 Both the extended library and the proofs of invertibility equivalences can be found 
at https://github.com/ekiciburak/bitvector/tree/frocos23. 
feasible for proofs in Coq using the library. In particular, it was chosen as one
that would require the minimal amount of changes to BVList. As a result, such
literals, as well as their invertibility conditions, contain only operators supported
by BVList (after its extension with $\gg_a$, $\le_u$, and $\ge_u$). Supporting the full set
of operators in $\Sigma_1$, both in the library and the proofs, is left for future work.


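(* Lexicographic comparison; both lists start with the most significant bit. *)
Fixpoint ule_list_big_endian (x y : list bool) :=
  match x, y with
  | [], [] => true
  | [], _ => false
  | _, [] => false
  | xi :: x', yi :: y' => ((eqb xi yi) && (ule_list_big_endian x' y'))
                          || ((negb xi) && yi)
  end.

(* Raw bit-vectors are little-endian, so reverse them before comparing. *)
Definition ule_list (x y : list bool) :=
  (ule_list_big_endian (rev x) (rev y)).

(* Raw version: only bit-vectors of equal size are compared. *)
Definition bv_ule (a b : bitvector) :=
  if @size a =? @size b then
    ule_list a b
  else
    false.

(* Dependently-typed version, delegating to the raw module M. *)
Definition bv_ule n (bv1 bv2 : bitvector n) : bool := M.bv_ule bv1 bv2.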


Fig. 3. Definitions of $\le_u$ in Coq.


In what follows, we describe our extensions to BVList with weak unsigned 
inequalities, alternative definitions for logical shifts, and the arithmetic right 
shift operator. 


Weak Unsigned Inequalities. We added both weak inequalities for unsigned
bit-vectors, $\le_u$ and $\ge_u$. We illustrate this extension via that of the $\le_u$ operator
(the extension of $\ge_u$ is similar). The relevant Coq definitions are provided
in Fig. 3. The top three definitions (including the fixpoint) cover the simply-typed
representation, and the fourth, bv_ule, is the dependently-typed representation
that invokes the definition with the same name from module M of type
RAWBITVECTOR. Like most other operators, $\le_u$ (over raw bit-vectors) is defined
over a few layers. The function bv_ule, at the highest layer, ensures that com- 
parisons are between bit-vectors of the same size and then calls ule_list. Since 
we want to compare bit-vectors starting from their most significant bits and the 
input lists start instead with the least significant bits, ule_list first reverses 
the two lists. Then it calls ule_list_big_endian, which we consider to be at the 
lowest layer of the definition. This function does a lexicographic comparison of 
the two lists, starting from the most significant bits. 
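
To make the layering concrete, here is a Python transcription of the three raw definitions, offered as a sketch rather than as part of BVList. Bit-vectors are modeled as little-endian lists of Booleans; the helper to_bits is an assumption of this example and has no counterpart in the library.

```python
def ule_list_big_endian(x, y):
    """Lexicographic <= on bit lists given most significant bit first."""
    if not x and not y:
        return True
    if not x or not y:
        return False
    xi, yi = x[0], y[0]
    return ((xi == yi and ule_list_big_endian(x[1:], y[1:]))
            or ((not xi) and yi))

def ule_list(x, y):
    """Reverse the little-endian inputs, then compare from the top bit."""
    return ule_list_big_endian(x[::-1], y[::-1])

def bv_ule(a, b):
    """Compare only bit-vectors of the same size."""
    return ule_list(a, b) if len(a) == len(b) else False

def to_bits(value, width):
    """Little-endian bit list: index 0 is the least significant bit."""
    return [bool((value >> i) & 1) for i in range(width)]

# The list-based comparison should agree with <= on unsigned integers.
print(all(bv_ule(to_bits(a, 4), to_bits(b, 4)) == (a <= b)
          for a in range(16) for b in range(16)))
```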
To see why the addition of $\le_u$ to the library is useful, consider, for example,
the following parametric lemma, stating that ${\sim}0$ is the largest unsigned
bit-vector of its type:

$\forall x : \sigma_{[n]}.\ x \le_u {\sim}0$        (3)


Without an operator for the weak inequality, we would write it as:


$\forall x : \sigma_{[n]}.\ x <_u {\sim}0 \lor x = {\sim}0$        (4)


Definition shl_one_bit (a: list bool) :=
  match a with
  | [] => []
  | _ => false :: removelast a
  end.

Fixpoint shl_n_bits (a: list bool) (n: nat) :=
  match n with
  | O => a
  | S n' => shl_n_bits (shl_one_bit a) n'
  end.

Definition shl_n_bits_a (a: list bool) (n: nat) :=
  if (n <? length a)%nat then
    mk_list_false n ++ firstn (length a - n) a
  else
    mk_list_false (length a).

Theorem bv_shl_eq: forall (a b : bitvector), bv_shl a b = bv_shl_a a b.


Fig. 4. Various definitions of <<. 


In such cases, since the definitions of $<_u$ and $=$ have a similar structure to that
of $\le_u$, we strip down the layers of $<_u$ and $=$ separately, whereas using $\le_u$, we
only do this once.


Left and Right Logical Shifts. We have redefined the shift operators <<
and >> in BVList. Figure 4 shows both the original and new definitions of <<.
Those of >> are similar. Originally, << was defined using shl_one_bit and
shl_n_bits. The function shl_one_bit shifts the bit-vector to the left by one 
bit and is called by shl_n_bits as many times as necessary. The new definition 
shl_n_bits_a uses mk_list_false which constructs the necessary list of 0 bits 
and appends (++ in Coq) to it the bits to be shifted from the original bit-vector, 
which are retrieved using the firstn function, from the Coq standard library 
for lists. The nat type used in Fig. 4 is the Coq representation of Peano natural 
numbers that has O and S as its two constructors — as depicted in the cases 
rendered by pattern matching n (lines 9-10). The theorem at the bottom of 
Fig. 4 asserts the equivalence of the two representations, allowing us to switch
between them, when needed. In the extended library, bv_shl defines the left shift
operation using shl_n_bits whereas bv_shl_a does it using shl_n_bits_a. This
new representation was useful in proving some of the invertibility equivalences 
over shift operators (see, e.g., Example 4 below). 
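
The agreement of the two definitions can be illustrated outside Coq with a small Python sketch over little-endian Boolean lists. It mirrors the raw list functions only; the helper to_int and the exhaustive loop are assumptions of this example, and the check is evidence on small inputs, not a replacement for the theorem bv_shl_eq.

```python
from itertools import product

def shl_one_bit(a):
    """Shift left by one: prepend a 0 bit, drop the most significant bit."""
    return [] if not a else [False] + a[:-1]

def shl_n_bits(a, n):
    """Original definition: apply the one-bit shift n times."""
    for _ in range(n):
        a = shl_one_bit(a)
    return a

def shl_n_bits_a(a, n):
    """Alternative definition: n zero bits followed by the kept prefix."""
    if n < len(a):
        return [False] * n + a[:len(a) - n]
    return [False] * len(a)

def to_int(bits):
    """Unsigned value of a little-endian bit list."""
    return sum(1 << i for i, b in enumerate(bits) if b)

# Compare both definitions on all bit lists of width 0..4, shifts 0..5.
agree = all(shl_n_bits(list(bits), n) == shl_n_bits_a(list(bits), n)
            for width in range(5)
            for bits in product([False, True], repeat=width)
            for n in range(6))
print(agree)
```

The closed form avoids recursion on the shift amount, which is one plausible reason it is easier to reason about in proofs.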


Arithmetic Right Shift. Unlike logical shifts that were already defined in 
BVList and for which we have added alternative definitions, arithmetic right 
shift was not defined at all. We provided two alternative definitions for it, very
similar to the definitions of logical shifts: bv_ashr and bv_ashr_a. Both definitions
are conditional on the sign of the bit-vector (its most-significant bit). Apart
from this detail, the definitions take the same approach taken by shl_n_bits and
shl_n_bits_a from Fig. 4. Operator bv_ashr uses the definition of an independent
shift and repeats it as many times as necessary, and bv_ashr_a
uses either mk_list_false or mk_list_true to append the necessary number of 
sign bits to the shifted bits. 
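
Since the Coq definitions of the arithmetic right shift are not shown here, the following Python sketch is a guess at their shape, based only on the description above and on the structure of shl_n_bits and shl_n_bits_a. All function names are hypothetical, and bit-vectors are again little-endian Boolean lists.

```python
from itertools import product

def ashr_one_bit(a):
    """Shift right by one: drop the least significant bit (the head)
    and repeat the sign bit (the last element)."""
    return [] if not a else a[1:] + [a[-1]]

def ashr_n_bits(a, n):
    """Iterated version: apply the one-bit shift n times."""
    for _ in range(n):
        a = ashr_one_bit(a)
    return a

def ashr_n_bits_a(a, n):
    """Closed form: surviving bits followed by n copies of the sign bit."""
    if not a:
        return []
    sign = a[-1]
    if n < len(a):
        return a[n:] + [sign] * n
    return [sign] * len(a)

def to_int(bits):
    """Unsigned value of a little-endian bit list."""
    return sum(1 << i for i, b in enumerate(bits) if b)

# Compare both versions on all bit lists of width 0..4, shifts 0..5.
agree = all(ashr_n_bits(list(bits), n) == ashr_n_bits_a(list(bits), n)
            for width in range(5)
            for bits in product([False, True], repeat=width)
            for n in range(6))
print(agree)
```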


5 Proving Invertibility Equivalences in Coq 


In this section we provide specific details about proving invertibility equivalences
in Coq. We start by outlining the general approach for proving invertibility
equivalences in Sect. 5.1. Then, Sect. 5.2 presents detailed examples of such
proofs. Section 5.3 summarizes the results and impact of these proofs.


5.1 General Approach 


The natural representation of bit-vectors in Coq is the dependently-typed repre- 
sentation, and therefore the invertibility equivalences are formulated using this 
representation. In keeping with the modular approach described in Sect. 4, how- 
ever, proofs in this representation are composed of proofs over simply-typed 
bit-vectors, which are easier to reason about. Most of the work is on proving an 
equivalence over raw bit-vectors. Then, we derive the proof of the corresponding 
equivalence over dependently-typed bit-vectors using a smaller, boilerplate set 
of tactics. Since this derivation process is mostly the same across many equiva- 
lences, these tactics are a good candidate for automation in the future. 

When proving an invertibility equivalence $\mathit{IC}[s,t] \Leftrightarrow \exists x.\, \ell[x,s,t]$, we first
split it into two sub-goals: the left-to-right and right-to-left implications. For
proving the left-to-right implication, since Coq implements a constructive logic,
the only way to prove an existentially quantified formula is to construct the
literal witnessing it. Thus, in addition to establishing the equivalence,
our proofs have the positive side effect of producing actual inverses for $x$ in literals of the
form $\ell[x,s,t]$. In Niemetz et al. [16], these are called conditional inverses, as the
fact that they are inverses is conditional on the correctness of the invertibility
condition. There, such inverses were synthesized automatically for a subset of
the literals. In each of our Coq proofs, such an inverse is found, even when the
proof is done by case-splitting. This provides a more general solution than the
one in [16], which did not consider case-splitting.


Example 2. Consider the literal $s >>_a x \geq_u t$. Its invertibility condition is
$(s \geq_u {\sim}s) \lor (s \geq_u t)$. The left-to-right implication of the invertibility equivalence is:


$\forall s,t : \sigma_{[n]}.\ (s \geq_u {\sim}s) \lor (s \geq_u t) \Rightarrow \exists x : \sigma_{[n]}.\ s >>_a x \geq_u t$


Here, case splitting is done on the disjunction in the invertibility condition.
When $s \geq_u {\sim}s$ is true, the inverse for $x$ is the bit-vector constant that corresponds
to the length of $s$, namely $n$; when $s \geq_u t$ is true, the inverse is 0.
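For small bit-widths, the equivalence of Example 2 and its two witnesses can be checked by brute force. The following Python sketch is only a sanity check under stated assumptions: bit-vectors are modelled as unsigned integers below 2^n, and the helper names (ashr, bvnot, ic, witness, check_width) are hypothetical and unrelated to the Coq development.

```python
def ashr(s, x, n):
    """Arithmetic right shift of the n-bit value s by x positions."""
    sign = s >> (n - 1)
    shift = min(x, n)
    shifted = s >> shift
    if sign:
        ones = (1 << n) - 1
        # Fill the vacated high-order positions with copies of the sign bit.
        shifted |= ones & ~((1 << (n - shift)) - 1)
    return shifted


def bvnot(s, n):
    """Bitwise negation on n bits."""
    return ~s & ((1 << n) - 1)


def ic(s, t, n):
    """Invertibility condition of Example 2: (s >=u ~s) or (s >=u t)."""
    return s >= bvnot(s, n) or s >= t


def witness(s, t, n):
    """Inverse chosen by the case split: n if s >=u ~s, else 0 if s >=u t."""
    if s >= bvnot(s, n):
        return n  # n < 2**n, so the constant fits in n bits
    if s >= t:
        return 0
    return None


def check_width(n):
    """Check the equivalence and both witnesses for all s, t of width n."""
    for s in range(1 << n):
        for t in range(1 << n):
            exists = any(ashr(s, x, n) >= t for x in range(1 << n))
            if exists != ic(s, t, n):
                return False
            if ic(s, t, n) and not ashr(s, witness(s, t, n), n) >= t:
                return False
    return True
```

A check of this kind is bounded by the widths it enumerates, which is exactly the limitation that the bit-width-independent Coq proofs remove.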


In addition to BVList, several proofs of invertibility equivalences benefited
from CoqHammer [7], a plug-in that aims at extending the level of
automation in Coq by combining machine learning and automated reasoning
techniques in a similar fashion to what is done by Sledgehammer [21] in
Isabelle/HOL [18]. CoqHammer, when triggered on some Coq goal, (i) submits
the goal together with potentially useful terms to external solvers/automated
provers, (ii) attempts to reconstruct returned proofs (if any) directly in the Coq
tactic language Ltac [8], and (iii) outputs the set of tactics closing the goal in
case of success. As we directly employ these tactics inside BVList, one does not
need to install CoqHammer in order to build the library, although it would be
beneficial for further extensions.


5.2 Detailed Examples 


In this section we provide specific examples for proofs of invertibility equiva- 
lences. The first example illustrates the two-theories approach of the library. 


Example 3. Consider the literal $s >>_a x <_u t$. Its invertibility condition is
$((s <_u t \lor \neg(s <_s 0)) \land t \neq 0)$. Figure 5 shows the proof of the following direction of
the corresponding invertibility equivalence:


$\forall s,t : \sigma_{[n]}.\ (\exists x : \sigma_{[n]}.\ s >>_a x <_u t) \Rightarrow ((s <_u t \lor \neg(s <_s 0)) \land t \neq 0)$


In the proof, lines 8-11 transform the dependent bit-vectors from the goal and
the hypotheses into simply-typed bit-vectors. Then, lines 12-14 invoke the corresponding
lemma for simply-typed bit-vectors (called InvCond.bvashr_ult2_rtl)
along with some simplifications.


Most of the effort in this project went into proving equivalences over raw 
bit-vectors, as the following example illustrates. 


Example 4. Consider the literal $x << s >_u t$. Its invertibility condition is
$(t <_u {\sim}0 << s)$. The corresponding invertibility equivalence is:


$\forall s,t : \sigma_{[n]}.\ (t <_u {\sim}0 << s) \Leftrightarrow (\exists x : \sigma_{[n]}.\ x << s >_u t)$    (5)

The left-to-right implication is easy to prove using ${\sim}0$ itself as the witness of the
existential proof goal and considering the symmetry between $>_u$ and $<_u$. The
proof of the right-to-left implication relies on the following lemma:


$\forall x,s : \sigma_{[n]}.\ (x << s) \leq_u ({\sim}0 << s)$    (6)


From the right side of the equivalence in Eq. (5), we get some Skolem $x$ for
which $x << s >_u t$ holds. Flipping the inequality, we have that $t <_u x << s$; using
this, and transitivity over $<_u$ and $\leq_u$, the lemma given by Eq. (6) gives us the
left side of the equivalence in Eq. (5).

As mentioned in Sect. 4, we have redefined the shift operators << and >> in
the library. This was instrumental, for example, in the proof of Eq. (6).
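Equations (5) and (6) can likewise be checked exhaustively for small widths. The sketch below is a Python approximation, not part of the library: bit-vectors are unsigned integers below 2^n, and shl, holds_eq5 and holds_eq6 are hypothetical helper names.

```python
def shl(x, s, n):
    """Logical left shift of the n-bit value x by s positions, truncated to n bits."""
    if s >= n:
        return 0
    return (x << s) & ((1 << n) - 1)


def holds_eq6(n):
    """Lemma (6): x << s <=u ~0 << s for all x, s of width n."""
    ones = (1 << n) - 1
    return all(
        shl(x, s, n) <= shl(ones, s, n)
        for x in range(1 << n)
        for s in range(1 << n)
    )


def holds_eq5(n):
    """Equivalence (5): t <u ~0 << s  iff  some x satisfies x << s >u t."""
    ones = (1 << n) - 1
    return all(
        (t < shl(ones, s, n)) == any(shl(x, s, n) > t for x in range(1 << n))
        for s in range(1 << n)
        for t in range(1 << n)
    )
```

The right-to-left direction of (5) in this check mirrors the argument in the text: any x with x << s >u t is bounded above by ~0 << s through (6).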


Theorem bvashr_ult2_rtl :
  forall (n : N), forall (s t : bitvector n),
    (exists (x : bitvector n), (bv_ult (bv_ashr_a s x) t = true)) ->
    (((bv_ult s t = true) \/ (bv_slt s (zeros n)) = false) /\
    (bv_eq t (zeros n)) = false).
Proof.
  intros n s t H.
  (* Move from dependently-typed to simply-typed bit-vectors. *)
  destruct H as ((x, Hx), H).
  destruct s as (s, Hs).
  destruct t as (t, Ht).
  unfold bv_ult, bv_slt, bv_ashr_a, bv_eq, bv in *. cbn in *.
  (* Invoke the corresponding lemma over simply-typed bit-vectors. *)
  specialize (InvCond.bvashr_ult2_rtl n s t Hs Ht); intro STIC.
  rewrite Hs, Ht in STIC. apply STIC.
  now exists x.
Qed.


Fig. 5. A proof of one direction of the invertibility equivalence for $>>_a$ and $<_u$, using
dependent types.


The new definition uses firstn and ++, over which many useful properties 
are already proven in the standard library. This benefits us in manual proofs, and 
in calls to CoqHammer, since the latter is able to use lemmas from the imported 
libraries to prove the goals that are given to it. Using this representation, proving 
Eq. (6) reduces to proving Lemmas bv_ule_1_firstn and bv_ule_pre_append, 
shown in Fig. 6. The proof of bv_ule_pre_append benefited from the property
app_comm_cons from the standard list library of Coq, whereas firstn_length_le 
was useful in reducing the goal of bv_ule_1_firstn to the Coq equivalent of Eq. 
(3). The statements of the properties mentioned from the standard library are 
also shown in Fig. 6. 


Finally, we examine what was considered a challenge problem in the previous 
version of this work [11]. The next example details how we completed the proof. 




Example 5. Consider the literal $(x >> s) >_u t$. Its invertibility condition is
$t <_u ({\sim}s >> s)$. Now consider the following direction of the corresponding invertibility
equivalence:


$\forall s,t : \sigma_{[n]}.\ t <_u ({\sim}s >> s) \Rightarrow \exists x : \sigma_{[n]}.\ (x >> s) >_u t$    (7)


Figure 7 contains the theorem stating the equivalence, and some lemmas used 
within its proof. A crucial step in the proof of the implication is to rewrite the 
definition of the right shift operator bv_shr to its alternate definition bv_shr_a 
(see Sect. 4.2). Unfolding the alternative definition leads to a case-analysis on 
the following condition: 

toNat(s) < len(x)


where toNat casts a bit-vector to its natural number representation, and len 
returns the length of a bit-vector as a natural number. 


Lemma bv_ule_1_firstn : forall (n : nat) (x : bitvector),
  (n < length x)%nat ->
  bv_ule (firstn n x) (firstn n (mk_list_true (length x))) = true.


Lemma bv_ule_pre_append : forall (x y z : bitvector),
  bv_ule x y = true -> bv_ule (z ++ x) (z ++ y) = true.


Theorem app_comm_cons : forall (x y:list A) (a:A),
  a :: (x ++ y) = (a :: x) ++ y.


Lemma firstn_length_le: forall l:list A, forall n:nat,
  n <= length l -> length (firstn n l) = n.


Fig. 6. Examples of lemmas used in proofs of invertibility equivalences. 


The challenge in the proof arises in the positive case of the condition, which
reduces to a proof of first_bits_zero (see Fig. 7). first_bits_zero says that
given toNat(s) < len(s), the most-significant len(s) - toNat(s) bits of s are 0.
As seen in Fig. 4, the second argument to the top-most layer of the shift (called 
from bv_shl_eq) is a bit-vector that specifies the number of times to shift the 
bit-vector in the first argument. This second argument is converted to a natural 
number by the abstract toNat function invoked above, the concrete definitions 
of which are specified in Fig. 7 as list2nat_be_a and list2N. At the same level 
of abstraction, we use rev for the list reversal function corresponding to the 
Coq function of the same name, and firstn also for its Coq namesake (firstn 
n l returns the n most significant bits of l), so that first_bits_zero can be 
specified as follows: 


toNat(s) < len(s) ⇒ firstn (len(s) - toNat(s)) (rev(s)) = 0


The intuition behind its validity is that if the most-significant len(s) - toNat(s)
bits were not 0 then they would contribute to the value of toNat(s), making it
greater than or equal to len(s) and thus falsifying the condition. However, it is
challenging to convert this intuition into a proof using induction over lists, as
explained in what follows.

To prove first_bits_zero, we redefined list2N as a tail-recursive function
list2NTR. This step was proven to be sound by a lemma of equivalence between
the two definitions (list2N_eq). Since list2N is not tail recursive, it only begins
computation at the end of the input list representing a bit-vector. Such a definition
further complicates the proof of first_bits_zero when based on the
typical induction principle over the structure of the Boolean list underlying the
bit-vector s. This is because it does not easily reduce (via ι-reduction for inductive
definitions [19]) into a useful expression in the step case of the intended
induction.
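The relationship between the two conversions, and the statement of first_bits_zero, can be explored with a small Python model of the definitions in Fig. 7. This is a sketch under assumptions: lists of Booleans stand in for raw bit-vectors, the accumulator loop stands in for the tail-recursive list2NR, and the function names are illustrative.

```python
from itertools import product


def list2n(bits):
    """Non-tail-recursive conversion: the head of the list is the least-significant bit."""
    if not bits:
        return 0
    return 2 * list2n(bits[1:]) + (1 if bits[0] else 0)


def list2n_tr(bits):
    """Accumulator-based conversion: consumes the list from its head, most-significant bit first."""
    acc = 0
    for b in bits:
        acc = 2 * acc + (1 if b else 0)
    return acc


def first_bits_zero(bits):
    """If toNat(s) < len(s), the first len(s) - toNat(s) bits of rev(s) are all False."""
    value, length = list2n(bits), len(bits)
    if value >= length:
        return True  # the premise does not hold, nothing to check
    k = length - value
    return list(reversed(bits))[:k] == [False] * k


def check_up_to(width):
    """Check list2N_eq and first_bits_zero on all Boolean lists up to the given width."""
    for n in range(width + 1):
        for tup in product([False, True], repeat=n):
            bits = list(tup)
            if list2n_tr(list(reversed(bits))) != list2n(bits):
                return False
            if not first_bits_zero(bits):
                return False
    return True
```

The check reflects the intuition given earlier: a value v below the length needs at most v low-order bits, since v < 2^v, so the remaining high-order bits must be zero.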

The advantage of tail recursion in this context is best illustrated by Fig. 8
where x is a Boolean variable and xs represents an arbitrary Boolean list. The 


Theorem bvshr_ugt_ltr : forall (n : N), forall (s t : bitvector n),
  (bv_ult t (bv_shr (bv_not s) s) = true) ->
  (exists (x : bitvector n), bv_ugt (bv_shr x s) t = true).


Lemma first_bits_zero : forall (s : bitvector),
  (N.to_nat (list2N s) < length s)%nat ->
  firstn (length s - N.to_nat (list2N s)) (rev s) =
  mk_list_false (length s - N.to_nat (list2N s)).


Lemma first_bits_zeroA : forall (s : bitvector),
  (length s >= (list2NTR s))%nat ->
  firstn (length s - (list2NTR s)) s =
  mk_list_false (length s - (list2NTR s)).


Fixpoint list2N (a: list bool) :=
  match a with
  | [] => 0
  | x :: xs => if x then N.succ_double (list2N xs) else
                         N.double (list2N xs)
  end.


Definition list2nat_be_a (a: list bool) := N.to_nat (list2N a).


Fixpoint list2NR (a: list bool) (n: nat) :=
  match a with
  | [] => n
  | x :: xs => if x then list2NR xs (2 * n + 1) else
                         list2NR xs (2 * n)
  end.


Definition list2NTR (a: list bool) := list2NR a 0.


Lemma list2N_eq: forall (s: bitvector),
  list2NTR (rev s) = N.to_nat (list2N s).


Fig. 7. Invertibility equivalence for >> and $>_u$ and some lemmas used by its proof.

x: bool    xs: list bool    IH: firstn (len(xs) - toNat(xs)) (rev(xs)) = 0
---------------------------------------------------------------------------  (8)
Goal: firstn (len(xs) + 1 - toNat(x :: xs)) (rev(x :: xs)) = 0

x: bool    xs: list bool    IH: firstn (len(xs) - toNatTR(xs)) (xs) = 0
---------------------------------------------------------------------------  (9)
Goal: firstn (len(xs) + 1 - toNatTR(xs ++ [x])) (xs ++ [x]) = 0


Fig. 8. Sub-goals generated in the proof of first_bits_zero. Note that 0 is a bit-vector
constant of the appropriate length (list of falses).


derivation of the goal from the inductive hypothesis (IH) in derivation (8) from
Fig. 8 is complicated in Coq because the functions firstn and rev are not well-matched
with list2N, if not incompatible. For instance, observe that in the
inductive step (Goal), as the first argument to firstn increases, the number of
bits fetched from the list increases towards the right. However, due to the little-endian
notation of bit-vectors and the fact that the list cons function (::) can be
seen as incrementing its argument list to its left, the rev function must be used
to correct the direction of increase of the second argument to firstn. Despite
this correction, an induction over s must deal with two structurally different
lists.

In contrast, the tail-recursive definition of list2NTR hides the rev function.
This is illustrated in derivation (9) in Fig. 8, where toNatTR corresponds
to list2NTR. Furthermore, such an induction over lists using append (++) to
the right, rather than cons to the left, is possible thanks to the reverse induction
principle*. Closing such a goal allowed us to prove the list2NTR-variant
of first_bits_zero, specified as first_bits_zeroA in Fig. 7, and the proof of
equivalence between the two definitions (list2N_eq) allowed us to use this in
closing the original goal (7).


5.3 Results 


Table 2 summarizes the results of proving invertibility equivalences for invertibility
conditions in the signature $\Sigma_0$. In the table, each entry records whether the invertibility
equivalence was successfully verified in Coq but not in Niemetz et al. [17],
verified in [17] but not in Coq, or verified
using both approaches. We successfully proved all invertibility equivalences over
= that are expressible in $\Sigma_0$, including 4 that were not proved in [17]. For the
rest of the predicates, we focused only on the 8 invertibility equivalences that
were not proved in [17], and succeeded in proving all of them.

Our work thus complements [17] in verifying all invertibility conditions in
$\Sigma_0$ for arbitrary bit-widths, by proving all 12 equivalences that were previously
unverified, and corroborating 7 others that were verified by SMT solvers. It also
complements [15], which verified all invertibility conditions in $\Sigma_1$, but only up
to bit-width of 65.


* See rev_ind in https://coq.inria.fr/library/Coq.Lists.List.html.




Table 2. Proved invertibility equivalences in $\Sigma_0$, where $\bowtie$ ranges over the given predicate
symbols. Each entry records whether the invertibility equivalence was successfully verified in
Coq but not in [17], in [17] but not in Coq, or using both approaches.

[Table 2 body. Columns: $\ell[x]$, $=$, $\neq$, $<_u$, $>_u$, $\leq_u$, $\geq_u$. Rows: $-x \bowtie t$, ${\sim}x \bowtie t$,
$x \mathbin{\&} s \bowtie t$, $x \mid s \bowtie t$, $x << s \bowtie t$, $s << x \bowtie t$, $x >> s \bowtie t$, $s >> x \bowtie t$,
$x >>_a s \bowtie t$, $s >>_a x \bowtie t$, $x + s \bowtie t$. The per-cell marks are not recoverable from the extracted text.]


6 Conclusion and Future Work 


We have described our work on verifying bit-vector invertibility conditions in 
the Coq proof assistant, which required extending the BVList library in Coq. In 
addition to describing the library and our extensions to it, this paper presented 
details about the Coq proofs of the invertibility equivalences. These were done 
on a representative subset of the operators from the theory of bit-vectors that 
is well-supported by the extended library. We were able to prove in Coq all the 
equivalences that were left unproven in previous attempts for all bit-widths, and 
also to prove in Coq some equivalences that were proven automatically before, 
thus increasing confidence in their correctness. 

The most immediate direction for future work is proving more of the invert- 
ibility equivalences supported by the bit-vector library. In addition, we plan to 
extend the library so that it supports the full syntax in which invertibility conditions
are expressed, namely $\Sigma_1$. This will also increase the potential usage of
the library for other applications. Another direction for future work is to extend 
the proofs for invertibility conditions where some of the bits are known. Such 
invertibility conditions were introduced by Niemetz and Preiner [14]. However, 
their formal verification for every bit-width is yet to be done. 


Acknowledgements. This work was funded in part by NSF-BSF grant numbers
2110397 (NSF) and 2020704 (BSF), and ISF grant number 619/21.
Abstract. We explore the relationship between weighted path orders
and (monotonic) semantic path orders. Our findings reveal that weighted 
path orders can be considered instances of a variant of semantic path 
orders that comprise order pairs. This observation leads to a generaliza- 
tion of weighted path orders that does not impose simplicity on their 
underlying algebras. As a result, the generalized version is capable of 
proving termination of term rewrite systems beyond the realm of sim- 
ple termination. In order to assess practicality we provide experimental 
data comparing generalized weighted path orders with the original ones 
as well as other well-known classes of reduction orders. 


Keywords: Term Rewriting · Termination · Weighted Path Order ·
Semantic Path Order


1 Introduction 


Reduction orders are a fundamental tool in termination analysis of term rewrite 
systems, and they also underlie completion-based automated theorem proving. 
Weighted path orders (WPOs) [27] are known as a versatile class of reduction 
orders; WPOs can simulate (generalized) Knuth-Bendix orders [7,13,16] and
lexicographic path orders [12], depending on the choice of parameters, namely 
simple monotone algebras and precedences. In fact, weighted path orders are so 
powerful that they characterize simple termination of term rewrite systems [20, 
Definition 6.3.7], that is, a term rewrite system is simply terminating if and only 
if it admits a compatible WPO. Besides automated termination analysis [14, 26], 
WPOs are used in reachability analysis [25], and automated theorem proving 
[11,18]. 

Another well-known class of reduction orders is the class of monotonic seman- 
tic path orders (MSPOs) [4,5], which are a monotonic version of semantic path 
orders (SPOs) [12]. MSPOs take triples of orders (called reduction triples) as 
parameters, and provide a complete characterization of terminating term rewrite
systems: A term rewrite system is terminating if and only if it admits a compatible
MSPO. However, the relationship between WPOs and MSPOs has not
been known [24].

In this paper, we give a solution to the open problem, demonstrating an effec- 
tive construction of an MSPO from the algebra and the precedence of a given 
WPO. The key of the proof lies in finding a suitable new variant of MSPOs, 
which is described as follows: First, the variant uses lexicographic comparison [4, 
Definition 4.5.1], as the original WPOs [27, Definition 5] are based on this 
comparison strategy. Second, the variant employs reduction triples [4, Defini- 
tion 4.1.19] because an example shows that a variant based on (quasi-)reduction 
pairs [5, Definition 4] leads to an invalid construction. 

The obtained simulation result leads to a generalization of WPOs, called
generalized WPOs (GWPOs), that
does not impose simplicity on their underlying algebras. The generalization
can show termination of term rewrite systems that are not simply terminating.
This is in sharp contrast to the termination proving power of WPOs. In
addition, upgrading WPOs to GWPOs can be done with little implementation
effort, so we anticipate that tools which employ WPOs as reduction orders
(e.g. [11,14,18,23,25,26]) may benefit from the power of GWPOs.

The remaining part of the paper is organized as follows. After recalling 
notions and notations for term rewriting and WPOs in Sect. 2, we introduce
a slightly modified version of semantic path orders that employs order pairs in 
Sect. 3. In Sect. 4 we show that weighted path orders are instances of semantic 
path orders. Using this fact, we introduce a generalization of WPOs in Sect. 5. 
In Sect. 6 experimental data for (generalized) weighted path orders are reported. 
As in the case of MSPOs [5, Section 5.2], GWPOs are capable of simulating a 
basic version of the dependency pair method [1]. This is discussed in Sect. 7. The 
paper is concluded by stating related work in Sect. 8. 


2 Preliminaries 


Throughout the paper, we assume familiarity with term rewriting [3,20]. First 
we briefly recall basic notions for term rewriting and reduction orders, and then 
introduce weighted path orders. 


2.1 Term Rewriting 


Let $\mathcal{F}$ be a signature and $\mathcal{V}$ a countable set of variables with $\mathcal{F} \cap \mathcal{V} = \varnothing$. The
set of all terms built from $\mathcal{F}$ and $\mathcal{V}$ is referred to as $\mathcal{T}(\mathcal{F}, \mathcal{V})$, or just as $\mathcal{T}$ when
$\mathcal{F}$ and $\mathcal{V}$ are clear from the context. When we need to indicate the arity $n$ of a
function symbol $f$, we write $f^{(n)}$ for $f$. Quasi-orders on the signature are called
(quasi-)precedences. A quasi-precedence $\succsim$ is called well-founded if its strict part
$\succ$ is well-founded. The size $|t|$ of a term $t$ is the number of function symbols and
variables occurring in $t$. Let $\Box$ be a constant with $\Box \notin \mathcal{F}$. Contexts are terms
over $\mathcal{F} \cup \{\Box\}$ that contain exactly one $\Box$. The term resulting from replacing $\Box$ in
a context $C$ by a term $t$ is denoted by $C[t]$. We write $s \trianglerighteq t$ if there is a context $C$
with $s = C[t]$. The strict part of $\trianglerighteq$ is denoted by $\triangleright$. A substitution is a mapping
$\sigma$ from variables to terms such that $\{x \in \mathcal{V} \mid \sigma(x) \neq x\}$ is finite. The application
$t\sigma$ of a substitution $\sigma$ to a term $t$ is inductively defined as follows: $t\sigma = \sigma(t)$ if $t$
is a variable, and $t\sigma = f(t_1\sigma, \ldots, t_n\sigma)$ if $t = f(t_1, \ldots, t_n)$.

A pair $(\ell, r)$ of terms is said to be a rewrite rule if $\ell$ is not a variable and
every variable in $r$ occurs in $\ell$. Rewrite rules $(\ell, r)$ are written as $\ell \to r$. A set of
rewrite rules is called a term rewrite system (TRS). Let $\mathcal{R}$ be a TRS. We write
$\mathcal{D}_\mathcal{R}$ for the set of defined symbols $\{f \mid f(\ell_1, \ldots, \ell_n) \to r \in \mathcal{R}\}$. The relation $\to_\mathcal{R}$
is defined on terms as follows: $s \to_\mathcal{R} t$ if there exist a rewrite rule $\ell \to r \in \mathcal{R}$,
a context $C$, and a substitution $\sigma$ such that $s = C[\ell\sigma]$ and $t = C[r\sigma]$ hold. The
TRS $\mathcal{R}$ is said to be terminating if there is no infinite sequence $t_1 \to_\mathcal{R} t_2 \to_\mathcal{R} \cdots$.
A relation $\rightsquigarrow$ on terms is closed under contexts if $C[s] \rightsquigarrow C[t]$ holds whenever
$s \rightsquigarrow t$ and $C$ is a context, and it is called closed under substitutions or just stable
if $s\sigma \rightsquigarrow t\sigma$ holds whenever $s \rightsquigarrow t$ and $\sigma$ is a substitution. We say $\rightsquigarrow$ has the
subterm property if $s \rightsquigarrow t$ for all terms $s, t$ satisfying $s \triangleright t$. Relations closed under
contexts and substitutions are called rewrite relations.

Termination is often shown by using orders. We say that a rewrite relation is
a rewrite preorder or reduction order if it is a preorder or a well-founded order,
respectively. A TRS $\mathcal{R}$ is compatible with a strict order $>$ if $\mathcal{R} \subseteq {>}$.


Proposition 1. A TRS R is terminating if R is compatible with some reduction 
order >. 


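An ordered $\mathcal{F}$-algebra is a triple $(A, \{f_\mathcal{A}\}_{f \in \mathcal{F}}, >)$, where $A$ is a set called a
carrier, $f_\mathcal{A}$ is an $n$-ary function on $A$ (called an interpretation function) associated
with each $f^{(n)} \in \mathcal{F}$, and $>$ is a strict order on $A$. Let $\mathcal{A} = (A, \{f_\mathcal{A}\}_{f \in \mathcal{F}}, >)$
be an ordered algebra. A mapping from $\mathcal{V}$ to $A$ is called an assignment for $\mathcal{A}$. The
interpretation $[\alpha]_\mathcal{A}(t)$ of a term $t$ under an assignment $\alpha$ is inductively defined as
follows: $[\alpha]_\mathcal{A}(t) = \alpha(t)$ if $t$ is a variable, and $[\alpha]_\mathcal{A}(t) = f_\mathcal{A}([\alpha]_\mathcal{A}(t_1), \ldots, [\alpha]_\mathcal{A}(t_n))$
if $t = f(t_1, \ldots, t_n)$. We write $s >_\mathcal{A} t$ if $[\alpha]_\mathcal{A}(s) > [\alpha]_\mathcal{A}(t)$ for all assignments $\alpha$.
The relation $>_\mathcal{A}$ is a strict order. Similarly we write $s \geq_\mathcal{A} t$ if $[\alpha]_\mathcal{A}(s) \geq [\alpha]_\mathcal{A}(t)$
holds for all assignments $\alpha$, where $\geq$ stands for the reflexive closure of $>$. The
relation $\geq_\mathcal{A}$ is a quasi-order, and satisfies ${\geq_\mathcal{A}} \cdot {>_\mathcal{A}} \cdot {\geq_\mathcal{A}} \subseteq {>_\mathcal{A}}$. We say that the
ordered algebra $\mathcal{A}$ is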


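- simple if $f_\mathcal{A}(a_1, \ldots, a_i, \ldots, a_n) \geq a_i$ for all $f^{(n)} \in \mathcal{F}$, $1 \leq i \leq n$, and
  $a_1, \ldots, a_n \in A$;
- weakly monotone if $f_\mathcal{A}(a_1, \ldots, a_i, \ldots, a_n) \geq f_\mathcal{A}(a_1, \ldots, b, \ldots, a_n)$ for all $f^{(n)} \in \mathcal{F}$,
  argument positions $1 \leq i \leq n$, and $a_1, \ldots, a_n, b \in A$ with $a_i > b$;
- simple monotone if it is simple and weakly monotone; and
- well-founded if $>$ is well-founded.


If $>$ is well-founded, so is $>_\mathcal{A}$. If $\mathcal{A}$ is a weakly monotone algebra, $\geq_\mathcal{A}$ is a rewrite
preorder. If in addition $\mathcal{A}$ is simple, $\geq_\mathcal{A}$ has the subterm property ${\triangleright} \subseteq {\geq_\mathcal{A}}$.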


2.2 Weighted Path Orders 


Weighted path orders (WPOs) are reduction orders introduced by Yamada et
al. [27]. The definition of WPOs is based on the pair of an ordered algebra $\mathcal{A}$
and a precedence $\succsim$. A WPO compares terms $s, t$ as a generalized KBO does:
First the terms are compared by $s >_\mathcal{A} t$. If only weak inequality $s \geq_\mathcal{A} t$ holds
then their root symbols, say $f$ and $g$, are compared by the precedence $\succsim$. If again
only weak inequality $f \succsim g$ holds, arguments are compared lexicographically.
Lexicographic comparison is formalized as follows. Let $>$ be a strict order on
a set $A$ and let $A^*$ denote the set of all strings (tuples) over $A$. The lexicographic
extension $>^{\mathsf{lex}}$ of $>$ is defined on $A^*$ as follows: $(a_1, \ldots, a_n) >^{\mathsf{lex}} (b_1, \ldots, b_m)$ if
there is a natural number $k < n$ such that

- $a_j = b_j$ for all $1 \leq j \leq k$, and
- either $k = m$, or $k < m$ and $a_{k+1} > b_{k+1}$.


It is known that $>^{\mathsf{lex}}$ is a strict order on $A^*$.
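The definition of the lexicographic extension translates almost literally into code. The following Python sketch is illustrative only; the function name lex_greater and the representation of strings over A as Python tuples are assumptions.

```python
def lex_greater(gt, a, b):
    """Return True if tuple a exceeds tuple b in the lexicographic extension of gt.

    gt(x, y) is the underlying strict order on elements.
    """
    n, m = len(a), len(b)
    # Candidate prefix lengths k with k < n; k <= m is needed for the prefixes to match.
    for k in range(min(n - 1, m) + 1):
        if a[:k] != b[:k]:
            break  # once a prefix differs, no longer prefix can agree
        if k == m or gt(a[k], b[k]):
            return True
    return False
```

For instance, with the usual order on integers, `lex_greater(lambda x, y: x > y, (1, 2), (1,))` holds because the second tuple is a proper prefix of the first, and a shorter tuple can still exceed a longer one when it wins at the first differing position.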


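Definition 1 ([27]). Let $\mathcal{A}$ be an ordered $\mathcal{F}$-algebra and $\succsim$ a precedence. The
weighted path order $>_{\mathsf{wpo}}$ is defined on terms over $\mathcal{F}$ as follows: $s >_{\mathsf{wpo}} t$ if


1. $s >_\mathcal{A} t$, or
2. $s \geq_\mathcal{A} t$, $s = f(s_1, \ldots, s_m)$, and one of the following conditions holds.
   a. $s_i \geq_{\mathsf{wpo}} t$ for some $1 \leq i \leq m$.
   b. $t = g(t_1, \ldots, t_n)$ and $s >_{\mathsf{wpo}} t_j$ for all $1 \leq j \leq n$, and moreover
      (i) $f \succ g$, or
      (ii) $f \succsim g$ and $(s_1, \ldots, s_m) >_{\mathsf{wpo}}^{\mathsf{lex}} (t_1, \ldots, t_n)$.


Here $\geq_{\mathsf{wpo}}$ denotes the reflexive closure of $>_{\mathsf{wpo}}$.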


Theorem 1 ([27]). Suppose that the signature is finite. For every simple mono- 
tone well-founded algebra and well-founded precedence the induced relation >wpo 
is a reduction order with the subterm property. 


Example 1. Consider the following TRS $\mathcal{R}$ taken from [27, Example 9]:

$f(g(x)) \to g(f(f(x))) \qquad f(h(x)) \to h(h(f(x)))$


Let $\mathcal{A}$ be the simple monotone algebra on $\mathbb{N}$ with $f_\mathcal{A}(x) = h_\mathcal{A}(x) = x$ and
$g_\mathcal{A}(x) = x + 1$. Take a precedence $\succsim$ with $f \succ g \succ h$. The relation $f(g(x)) >_{\mathsf{wpo}}
g(f(f(x)))$ is verified by the following derivation:

$$\dfrac{f(g(x)) \geq_\mathcal{A} g(f(f(x))) \qquad f \succ g \qquad \dfrac{f(g(x)) >_\mathcal{A} f(f(x))}{f(g(x)) >_{\mathsf{wpo}} f(f(x))}\ \text{WPO 1}}{f(g(x)) >_{\mathsf{wpo}} g(f(f(x)))}\ \text{WPO 2b(i)}$$

Here WPO 1 and WPO 2b(i) indicate the corresponding conditions in Definition
1. Similarly, one can verify $f(h(x)) >_{\mathsf{wpo}} h(h(f(x)))$. Therefore, $\mathcal{R} \subseteq {>_{\mathsf{wpo}}}$ follows.
Hence, we conclude that $\mathcal{R}$ is terminating.


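The two orientations claimed in Example 1 can be reproduced mechanically. The Python sketch below implements Definition 1 for the specific algebra and precedence of the example; it is a hedged illustration, not a termination tool. Assumptions: terms are nested tuples such as ("f", ("g", "x")), variables are strings, interpretations are restricted to linear polynomials over the natural numbers (for which coefficient-wise comparison decides the two algebra orders), and the precedence is encoded by the hypothetical table PREC.

```python
# Linear interpretations: symbol -> (constant, coefficient per argument).
ALGEBRA = {"f": (0, [1]), "g": (1, [1]), "h": (0, [1])}
# Precedence f > g > h, encoded by integer ranks.
PREC = {"f": 2, "g": 1, "h": 0}


def interpret(t):
    """Return (constant, {variable: coefficient}) for the interpretation of t."""
    if isinstance(t, str):
        return 0, {t: 1}
    const, coeffs = ALGEBRA[t[0]]
    total = {}
    for c, arg in zip(coeffs, t[1:]):
        arg_const, arg_vars = interpret(arg)
        const += c * arg_const
        for v, k in arg_vars.items():
            total[v] = total.get(v, 0) + c * k
    return const, total


def geq_alg(s, t):
    """s >=_A t: every coefficient and the constant of s dominate those of t."""
    (cs, vs), (ct, vt) = interpret(s), interpret(t)
    return cs >= ct and all(vs.get(v, 0) >= k for v, k in vt.items())


def gt_alg(s, t):
    """s >_A t: weak domination plus a strictly larger constant."""
    return geq_alg(s, t) and interpret(s)[0] > interpret(t)[0]


def lex_gt(ss, ts):
    """Lexicographic extension of wpo_gt on argument tuples."""
    for k in range(min(len(ss) - 1, len(ts)) + 1):
        if ss[:k] != ts[:k]:
            break
        if k == len(ts) or wpo_gt(ss[k], ts[k]):
            return True
    return False


def wpo_gt(s, t):
    """Decide s >_wpo t following Definition 1."""
    if gt_alg(s, t):                                   # WPO 1
        return True
    if isinstance(s, str) or not geq_alg(s, t):
        return False
    f, ss = s[0], s[1:]
    if any(si == t or wpo_gt(si, t) for si in ss):     # WPO 2a
        return True
    if isinstance(t, str):
        return False
    g, ts = t[0], t[1:]
    if not all(wpo_gt(s, tj) for tj in ts):            # WPO 2b
        return False
    return PREC[f] > PREC[g] or (PREC[f] >= PREC[g] and lex_gt(ss, ts))
```

Calling `wpo_gt(("f", ("g", "x")), ("g", ("f", ("f", "x"))))` follows the derivation shown in the example: the algebra comparison is only weak, the precedence decides via WPO 2b(i), and the side condition is discharged by WPO 1.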

The following example shows that the simplicity condition cannot be dropped 
from Theorem 1. 


Example 2. Any WPO $>_{\mathsf{wpo}}$ induced by the weakly monotone but non-simple
algebra $\mathcal{A}$ on $\mathbb{N}$ with $a_\mathcal{A} = 1$ and $f_\mathcal{A}(x) = 0$ lacks well-foundedness as it admits
the cyclic sequence $f(a) >_{\mathsf{wpo}} f(f(a)) >_{\mathsf{wpo}} f(a)$.
3 Semantic Path Orders Based on Order Pairs


Borralleras [4, Definition 4.1.19] introduced a variant of SPO that employs a pair 
of a quasi-order and a strict order. This variant compares arguments of terms 
by a multiset order. In order to simulate WPOs which compare arguments in a 
lexicographic manner, we introduce another variant of SPO. 

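We say that the pair $(\gtrsim, >)$ of a quasi-order $\gtrsim$ and a strict order $>$ is an
order pair if ${\gtrsim} \cdot {>} \cdot {\gtrsim} \subseteq {>}$. The inclusion is referred to as compatibility. We say
that an order pair $(\sqsupseteq, \sqsupset)$ on terms is stable if both $\sqsupseteq$ and $\sqsupset$ are stable.

Definition 2. Let $(\sqsupseteq, \sqsupset)$ be a stable order pair on $\mathcal{T} \setminus \mathcal{V}$.¹ The semantic path
order $>_{\mathsf{spo}}$ (SPO) is defined on terms as follows: $s >_{\mathsf{spo}} t$ if $s = f(s_1, \ldots, s_m)$
and one of the following conditions holds:

1. $s_i \geq_{\mathsf{spo}} t$ for some $1 \leq i \leq m$.
2. $t = g(t_1, \ldots, t_n)$ and $s >_{\mathsf{spo}} t_j$ for all $1 \leq j \leq n$, and moreover
   a. $s \sqsupset t$, or
   b. $s \sqsupseteq t$ and $(s_1, \ldots, s_m) >_{\mathsf{spo}}^{\mathsf{lex}} (t_1, \ldots, t_n)$.


Here $\geq_{\mathsf{spo}}$ denotes the reflexive closure of $>_{\mathsf{spo}}$.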


Remark 1. The standard definitions of SPOs ([12] and [4, Definition 4.1.19]) use
the multiset extension of $>_{\mathsf{spo}}$ in SPO 2b instead of the lexicographic extension.
The lexicographic version of SPOs, introduced by Borralleras [4, Definition 4.5.1],
can be obtained by setting $\sqsupset$ to the strict part of $\sqsupseteq$ in Definition 2.


Example 3. Lexicographic path orders (LPOs) are special instances of SPOs.
Let $\succsim$ be a precedence. Define $f(s_1, \ldots, s_m) \sqsupseteq g(t_1, \ldots, t_n)$ by $f \succsim g$, and let
$\sqsupset$ be the strict part of $\sqsupseteq$. The semantic path order induced by $(\sqsupseteq, \sqsupset)$ is the
lexicographic path order induced by $\succsim$.


Let $(\sqsupseteq, \sqsupset)$ be a stable order pair on $\mathcal{T} \setminus \mathcal{V}$ and let $>_{\mathsf{spo}}$ be the semantic path
order induced by $(\sqsupseteq, \sqsupset)$. The transitivity, irreflexivity, and stability of $>_{\mathsf{spo}}$ are
straightforward. A small remark is that the compatibility ${\sqsupseteq} \cdot {\sqsupset} \cdot {\sqsupseteq} \subseteq {\sqsupset}$ is used
in the proof of the transitivity.


Lemma 1. The SPO >spo is a stable strict order. 


When the signature is infinite, the lexicographic version of SPOs is not well-founded
in general even if $\sqsupset$ is well-founded. This forms a contrast to the multiset
versions of SPOs mentioned in Remark 1.


Example 4. Consider the signature consisting of $a^{(0)}$, $b^{(0)}$, and $f_i^{(i)}$ for all numbers
$i \in \mathbb{N}$. Let $\succsim$ be a well-founded precedence satisfying $a \succ b$ and $f_i \succsim f_j$ for
all $i, j \in \mathbb{N}$. The pair $(\sqsupseteq, \sqsupset)$ defined as in Example 3 is an order pair with $\sqsupset$
well-founded, but the SPO $>_{\mathsf{spo}}$ induced from $(\sqsupseteq, \sqsupset)$ admits the infinite chain:

$f_1(a) >_{\mathsf{spo}} f_2(b, a) >_{\mathsf{spo}} f_3(b, b, a) >_{\mathsf{spo}} \cdots$

See [22, Section 3] and [19, Section 3] for related discussions.
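The first steps of the chain in Example 4 can be confirmed with a direct implementation of Definition 2 for the order pair of Example 3. The Python below is a sketch under assumptions: every symbol $f_i$ is represented by the single name "f" (its index is the arity of the tuple), constants are one-element tuples, and the rank table RANK is one hypothetical precedence satisfying the requirements of the example; the rank of "f" relative to "a" and "b" is an arbitrary choice.

```python
# One precedence with a > b and all f_i equivalent to each other.
RANK = {"f": 2, "a": 1, "b": 0}


def geq_pair(s, t):
    """s is weakly above t in the order pair: compare root symbols by the precedence."""
    return RANK[s[0]] >= RANK[t[0]]


def gt_pair(s, t):
    """Strict part of geq_pair."""
    return RANK[s[0]] > RANK[t[0]]


def lex_gt(ss, ts):
    """Lexicographic extension of spo_gt on argument tuples."""
    for k in range(min(len(ss) - 1, len(ts)) + 1):
        if ss[:k] != ts[:k]:
            break
        if k == len(ts) or spo_gt(ss[k], ts[k]):
            return True
    return False


def spo_gt(s, t):
    """Decide s >_spo t following Definition 2 (terms are tuples, variables are strings)."""
    if isinstance(s, str):
        return False
    ss = s[1:]
    if any(si == t or spo_gt(si, t) for si in ss):     # SPO 1
        return True
    if isinstance(t, str):
        return False
    ts = t[1:]
    if not all(spo_gt(s, tj) for tj in ts):            # SPO 2
        return False
    return gt_pair(s, t) or (geq_pair(s, t) and lex_gt(ss, ts))  # SPO 2a, 2b


def chain_term(i):
    """The i-th term of the chain: f_i(b, ..., b, a) with i - 1 copies of b."""
    return ("f",) + (("b",),) * (i - 1) + (("a",),)
```

Each step is justified by SPO 2b: the argument tuples first differ at the position where the left term has a and the right term has b, regardless of the right tuple being longer, which is why unbounded arities break well-foundedness.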


¹ The restriction to $\mathcal{T} \setminus \mathcal{V}$ is not essential but meant to be a minimum requirement.
Observe that Definition 2 uses the order pair only when $s$ and $t$ are not variables.
Well-foundedness of $>_{\mathsf{spo}}$ is restored by assuming existence of an upper bound
of arities. We refer to this property as boundedness of the signature. Needless to
say, a signature is bounded whenever it is finite.

Hereafter we assume that $\sqsupset$ is well-founded and $\mathcal{F}$ is bounded. For showing
that $>_{\mathsf{spo}}$ is well-founded, we adopt Buchholz's method [6]. One can find a similar
proof in [27, Lemma 8]. We write $\mathsf{SN}(>_{\mathsf{spo}})$ for the set of all terms $t$ such that
there is no infinite descending sequence $t >_{\mathsf{spo}} t_1 >_{\mathsf{spo}} t_2 >_{\mathsf{spo}} \cdots$ starting from
$t$.² The following properties are immediate:


- The restriction of $>_{\mathsf{spo}}$ to $\mathsf{SN}(>_{\mathsf{spo}})$ is a well-founded order on $\mathsf{SN}(>_{\mathsf{spo}})$.
- $t \in \mathsf{SN}(>_{\mathsf{spo}})$ if $u \in \mathsf{SN}(>_{\mathsf{spo}})$ for all terms $u$ with $t >_{\mathsf{spo}} u$.


Buchholz's method proves well-foundedness by well-founded induction. To
express our well-founded order for induction, we recall the notion of the lexicographic
product of order pairs. Let $(\gtrsim_1, >_1), \ldots, (\gtrsim_n, >_n)$ be $n$ order pairs on
sets $A_1, \ldots, A_n$, respectively. The lexicographic product $(\gtrsim_1, >_1) \otimes \cdots \otimes (\gtrsim_n, >_n)$
is the strict order $>$ defined on $A_1 \times \cdots \times A_n$ as follows: $(a_1, \ldots, a_n) > (b_1, \ldots, b_n)$
if there exists an index $k \in \{1, \ldots, n\}$ such that $a_k >_k b_k$ and $a_j \gtrsim_j b_j$ for all
$1 \leq j < k$. Note that the lexicographic product $>$ is well-founded if every $>_i$ is
well-founded.
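The lexicographic product can be sketched as a higher-order function that takes the component order pairs and returns the strict order on tuples. The Python below is illustrative; the name lex_product and the encoding of each order pair as a pair of Boolean predicates are assumptions.

```python
def lex_product(pairs):
    """Build the strict order of the lexicographic product.

    pairs is a list of (geq_i, gt_i) predicates, one order pair per component.
    """
    def greater(a, b):
        for (geq, gt), x, y in zip(pairs, a, b):
            if gt(x, y):
                return True   # index k found; all earlier components were weakly ordered
            if not geq(x, y):
                return False  # no later index can satisfy the condition
        return False
    return greater
```

With the usual orders on integers in both components, `lex_product([(ge, gt), (ge, gt)])` compares pairs by the first component and falls back to the second only when the first is weakly, but not strictly, ordered.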

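Given a set $A$, we write $A^{\leq k}$ for the union of $A^i$ for all $i \leq k$. If a strict order
$>$ on $A$ is well-founded, then the restriction of $>^{\mathsf{lex}}$ to $A^{\leq k}$ is also well-founded,
see [19, Section 3]. Thus, the lexicographic product $\gg$ given by


$(\sqsupseteq, \sqsupset) \otimes (\geq_{\mathsf{spo}}^{\mathsf{lex}}, >_{\mathsf{spo}}^{\mathsf{lex}}) \otimes (\trianglerighteq, \triangleright)$


is a well-founded order on $(\mathcal{T} \setminus \mathcal{V}) \times \mathcal{T}^{\leq M} \times \mathcal{T}$. Here $M$ stands for the maximum
arity in the signature $\mathcal{F}$, and $\geq_{\mathsf{spo}}^{\mathsf{lex}}$ for the reflexive closure of $>_{\mathsf{spo}}^{\mathsf{lex}}$.


Lemma 2. The term $u$ belongs to $\mathsf{SN}(>_{\mathsf{spo}})$ whenever $t = f(t_1, \ldots, t_n) >_{\mathsf{spo}} u$
and $t_1, \ldots, t_n \in \mathsf{SN}(>_{\mathsf{spo}})$.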


Proof. We show the claim by well-founded induction on $(t, (t_1, \ldots, t_n), u)$ with
respect to $\gg$. Here we proceed by analyzing the derivation of $t >_{\mathsf{spo}} u$. If $t >_{\mathsf{spo}} u$
is derived from SPO 1 then $t_i \geq_{\mathsf{spo}} u$ for some $i \in \{1, \ldots, n\}$. In this case
$u \in \mathsf{SN}(>_{\mathsf{spo}})$ trivially follows from $t_i \in \mathsf{SN}(>_{\mathsf{spo}})$. If $t >_{\mathsf{spo}} u$ is derived from
SPO 2a or SPO 2b, then $u$ is of the form $g(u_1, \ldots, u_m)$ and $t >_{\mathsf{spo}} u_j$ for all
$j \in \{1, \ldots, m\}$. From $u \triangleright u_j$ we have $(t, (t_1, \ldots, t_n), u) \gg (t, (t_1, \ldots, t_n), u_j)$. So
from the induction hypothesis $u_j \in \mathsf{SN}(>_{\mathsf{spo}})$ for each $j$. For showing our goal
$u \in \mathsf{SN}(>_{\mathsf{spo}})$ fix an arbitrary term $v$ with $u >_{\mathsf{spo}} v$. We further distinguish the
case of SPO 2a and that of SPO 2b.


a. If $t >_{\mathsf{spo}} u$ is derived from SPO 2a then $t \sqsupset u$. Thus, $(t, (t_1, \ldots, t_n), u) \gg
   (u, (u_1, \ldots, u_m), v)$, and the induction hypothesis yields $v \in \mathsf{SN}(>_{\mathsf{spo}})$.
b. If $t >_{\mathsf{spo}} u$ is derived from SPO 2b then we additionally have $t \sqsupseteq u$ and
   $(t_1, \ldots, t_n) >_{\mathsf{spo}}^{\mathsf{lex}} (u_1, \ldots, u_m)$. Thus, $(t, (t_1, \ldots, t_n), u) \gg (u, (u_1, \ldots, u_m), v)$
   holds. So from the induction hypothesis we obtain $v \in \mathsf{SN}(>_{\mathsf{spo}})$.


In either case $v \in \mathsf{SN}(>_{\mathsf{spo}})$. So we conclude $u \in \mathsf{SN}(>_{\mathsf{spo}})$.


² SN stands for strong normalization, which is another name of termination.


Lemma 3. The relation >spo is well-founded. 


Proof. We show that $t \in \mathsf{SN}(>_{\mathsf{spo}})$ by induction on $|t|$. If $t$ is a variable, trivially
$t \in \mathsf{SN}(>_{\mathsf{spo}})$. Otherwise, Lemma 2 applies.


Theorem 2. Every semantic path order is a stable well-founded order, provided 
that the signature is bounded. 


In general, semantic path orders are not closed under contexts. For a remedy, 
Borralleras et al. [5] propose the use of another preorder with the harmony 
property. This results in monotonic semantic path orders. 


Definition 3 ([4, Definition 4.1.20]). A triple (≿, ⊒, ⊐) is a reduction triple if
≿ is a rewrite preorder on terms, (⊒, ⊐) is a stable order pair on T \ V with ⊐
well-founded, and ≿ and ⊒ have the harmony property, meaning that for every
f^(n) ∈ F the implication

\[
  s_i \succsim t \implies f(s_1, \ldots, s_i, \ldots, s_n) \sqsupseteq f(s_1, \ldots, t, \ldots, s_n)
\]

holds for all terms s1,...,sn, t and argument positions 1 ≤ i ≤ n.


Definition 4. Let (≿, ⊒, ⊐) be a reduction triple, and let >spo be the semantic
path order induced from (⊒, ⊐). The monotonic semantic path order (MSPO)
s >mspo t is defined as s ≿ t and s >spo t.


Theorem 3. Every monotonic semantic path order is a reduction order, pro- 
vided that the signature is bounded. 


Proof. The proof due to Borralleras et al. [5, Theorem 2] goes through. 


4 Simulating WPOs by SPOs 


We show that WPOs are instances of SPOs by constructing a suitable order
pair (⊒, ⊐) from a weakly monotone well-founded algebra A and a well-founded
precedence ≿. For terms s = f(s1,...,sm), t = g(t1,...,tn) we write s ⊒ t if
s >A t, or both s ≥A t and f ≿ g. Similarly, we define s ⊐ t if s >A t, or
both s ≥A t and f > g. It is worth noting that the proof of [27, Lemma 8] also
combines the interpretation order and precedence in a lexicographic manner.


Lemma 4. The pair (⊒, ⊐) is a stable order pair with ⊐ well-founded.


In the remaining part of the section we consider the WPO >wpo induced by
A and ≿, and the SPO >spo induced by the corresponding order pair (⊒, ⊐).
Note that ⊐ is not a strict part of ⊒ in general, as >A is not necessarily the
strict part of ≥A. This is why we decoupled ⊐ from ⊒ in Definition 2; see also
Remark 1.


Example 5. Let the signature F = {f^(1)}. Consider the trivial precedence f ≿ f
and the algebra A over the carrier N with the interpretation f_A(x) = 2x. On the
one hand we have f(f(x)) ⊒ f(x) from f(f(x)) ≥A f(x) but not f(x) ⊒ f(f(x)) as
f(x) ≥A f(f(x)) does not hold. On the other hand f(f(x)) ⊐ f(x) does not hold.
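
The three claims of Example 5 can be checked mechanically. The following is a minimal sketch, assuming that a finite sample of N is good enough for a sanity check (it is not a proof) and using the hypothetical names `weak` and `strict` for the two relations of the order pair.

```python
# Example 5 data: carrier N, f_A(x) = 2x, trivial precedence on {f}.
# SAMPLE is a finite stand-in for N: a sanity check, not a proof.
SAMPLE = range(50)

def f_A(x: int) -> int:
    return 2 * x

def geq_A(lhs, rhs) -> bool:
    """lhs >=_A rhs, checked pointwise on SAMPLE."""
    return all(lhs(x) >= rhs(x) for x in SAMPLE)

def gt_A(lhs, rhs) -> bool:
    """lhs >_A rhs, checked pointwise on SAMPLE."""
    return all(lhs(x) > rhs(x) for x in SAMPLE)

# Both terms have root f, so the precedence contributes only
# "f weakly above f" (true) and "f strictly above f" (false).
F_WEAKLY_ABOVE_F = True
F_STRICTLY_ABOVE_F = False

def weak(lhs, rhs) -> bool:
    """Weak relation of the order pair: >_A, or >=_A plus weak precedence."""
    return gt_A(lhs, rhs) or (geq_A(lhs, rhs) and F_WEAKLY_ABOVE_F)

def strict(lhs, rhs) -> bool:
    """Strict relation of the order pair: >_A, or >=_A plus strict precedence."""
    return gt_A(lhs, rhs) or (geq_A(lhs, rhs) and F_STRICTLY_ABOVE_F)

def fx(x):
    return f_A(x)            # interpretation of f(x)

def ffx(x):
    return f_A(f_A(x))       # interpretation of f(f(x))

print(weak(ffx, fx), weak(fx, ffx), strict(ffx, fx))
```

The strict comparison fails only because of x = 0, where 4x and 2x coincide; this is exactly why the strict relation is not the strict part of the weak one here.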
We illustrate how the derivation of >wpo in Example 1 is simulated by the 
semantic path order. 


Example 6 (continued from Example 1). From f(g(x)) ≥A g(f(f(x))) and f > g
the inequality f(g(x)) ⊐ g(f(f(x))) is obtained. Moreover, we have f(g(x)) >A
f(f(x)). Since ≥A has the subterm property, the subterm f(x) of f(f(x)) also
satisfies f(g(x)) >A f(x). Thus we obtain f(g(x)) ⊐ f(f(x)), f(x). Therefore,
f(g(x)) >spo g(f(f(x))) is verified as follows:

\[
\dfrac{f(g(x)) \sqsupset g(f(f(x))) \qquad
  \dfrac{f(g(x)) \sqsupset f(f(x)) \qquad
    \dfrac{f(g(x)) \sqsupset f(x) \qquad
      \dfrac{\dfrac{x \geq_{\mathsf{spo}} x}{g(x) \geq_{\mathsf{spo}} x}\ \text{SPO 1}}
            {f(g(x)) >_{\mathsf{spo}} x}\ \text{SPO 1}}
         {f(g(x)) >_{\mathsf{spo}} f(x)}\ \text{SPO 2a}}
       {f(g(x)) >_{\mathsf{spo}} f(f(x))}\ \text{SPO 2a}}
     {f(g(x)) >_{\mathsf{spo}} g(f(f(x)))}\ \text{SPO 2a}
\]

Similarly, f(h(x)) >spo h(h(f(x))) can be verified. Hence, the inclusion R ⊆ >spo
holds. Observe that the use of WPO 1 in Example 1 is replaced by successive
application of SPO 1 and SPO 2a.


As shown in the example, the subterm property of ≥A is a key for filling in
the gap between >spo and >wpo.


Lemma 5. Suppose that A is simple. If s >wpo t then s >spo t.


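Proof. We prove the claim by induction on |s| + |t|. Let s = f(s1,...,sm) >wpo t.
Depending on the derivation of s >wpo t, we distinguish five cases.

- Suppose that t is a variable and s >wpo t is derived from s >A t. One can
  verify that t occurs in s. Because s is not a variable, s ▷ t follows. By the
  subterm property of >spo we obtain s >spo t.

- Suppose that t = g(t1,...,tn) and s >wpo t is derived from s >A t. From
  s >A t we obtain s ⊐ t. Since A is simple, for every 1 ≤ j ≤ n we have
  s >A t ≥A tj, which leads to s >A tj. Hence, s >spo t is derived as follows:

\[
\dfrac{s \sqsupset t \qquad
  \dfrac{\dfrac{\forall j.\ s >_{\mathcal{A}} t_j}{\forall j.\ s >_{\mathsf{wpo}} t_j}\ \text{WPO 1}}
        {\forall j.\ s >_{\mathsf{spo}} t_j}\ \text{I.H.}}
  {s >_{\mathsf{spo}} g(t_1, \ldots, t_n) = t}\ \text{SPO 2a}
\]

- Suppose that s >wpo t is derived as follows:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad s_i \geq_{\mathsf{wpo}} t}
      {s = f(s_1, \ldots, s_m) >_{\mathsf{wpo}} t}\ \text{WPO 2a}
\]

  By the induction hypothesis we have si ≥spo t for some i, and thus s >spo t
  by SPO 1.

- Suppose that s >wpo t is derived as follows:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad f > g \qquad \forall j.\ s >_{\mathsf{wpo}} t_j}
      {s = f(s_1, \ldots, s_m) >_{\mathsf{wpo}} g(t_1, \ldots, t_n) = t}\ \text{WPO 2b(i)}
\]

  From s ≥A t and f > g we obtain s ⊐ t. Thus, we have:

\[
\dfrac{s \sqsupset t \qquad
  \dfrac{\forall j.\ s >_{\mathsf{wpo}} t_j}{\forall j.\ s >_{\mathsf{spo}} t_j}\ \text{I.H.}}
  {s = f(s_1, \ldots, s_m) >_{\mathsf{spo}} g(t_1, \ldots, t_n) = t}\ \text{SPO 2a}
\]

- Suppose that s >wpo t is derived as follows:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad f \succsim g \qquad \forall j.\ s >_{\mathsf{wpo}} t_j \qquad
       (s_1, \ldots, s_m) >_{\mathsf{wpo}}^{\mathsf{lex}} (t_1, \ldots, t_n)}
      {s = f(s_1, \ldots, s_m) >_{\mathsf{wpo}} g(t_1, \ldots, t_n) = t}\ \text{WPO 2b(ii)}
\]

  From s ≥A t and f ≿ g we obtain s ⊒ t. Thus, we have:

\[
\dfrac{s \sqsupseteq t \qquad
  \dfrac{\forall j.\ s >_{\mathsf{wpo}} t_j}{\forall j.\ s >_{\mathsf{spo}} t_j}\ \text{I.H.} \qquad
  \dfrac{(s_1, \ldots, s_m) >_{\mathsf{wpo}}^{\mathsf{lex}} (t_1, \ldots, t_n)}
        {(s_1, \ldots, s_m) >_{\mathsf{spo}}^{\mathsf{lex}} (t_1, \ldots, t_n)}\ \text{I.H.}}
  {s = f(s_1, \ldots, s_m) >_{\mathsf{spo}} g(t_1, \ldots, t_n) = t}\ \text{SPO 2b}
\]

In any case we have s >spo t.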


Next we prove the converse direction of Lemma 5. The next lemma is a basic 
property of WPOs. 


Lemma 6. If s >wpo t then s ≥A t.


Lemma 7. Suppose that A is simple. If s >spo t then s >wpo t.


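Proof. We prove the claim by induction on |s| + |t|. We distinguish three cases,
depending on the derivation of s >spo t.

- Suppose that s >spo t is derived as follows:

\[
\dfrac{s_i \geq_{\mathsf{spo}} t}{s = f(s_1, \ldots, s_n) >_{\mathsf{spo}} t}\ \text{SPO 1}
\]

  The induction hypothesis yields si ≥wpo t for some i. By Lemma 6 and the
  subterm property of ≥A we have s ≥A t. Thus, we obtain the following
  derivation of s >wpo t:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad s_i \geq_{\mathsf{wpo}} t}
      {s = f(s_1, \ldots, s_n) >_{\mathsf{wpo}} t}\ \text{WPO 2a}
\]

- Suppose that s >spo t is derived as follows:

\[
\dfrac{s \sqsupset t \qquad \forall j.\ s >_{\mathsf{spo}} t_j}
      {s = f(s_1, \ldots, s_n) >_{\mathsf{spo}} g(t_1, \ldots, t_m) = t}\ \text{SPO 2a}
\]

  According to the definition of s ⊐ t, we further distinguish two subcases. If
  s >A t then s >wpo t is immediate. Otherwise, s ≥A t and f > g hold. In this
  case we derive s >wpo t as follows:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad f > g \qquad
  \dfrac{\forall j.\ s >_{\mathsf{spo}} t_j}{\forall j.\ s >_{\mathsf{wpo}} t_j}\ \text{I.H.}}
  {s = f(s_1, \ldots, s_n) >_{\mathsf{wpo}} g(t_1, \ldots, t_m) = t}\ \text{WPO 2b(i)}
\]

- Suppose that s >spo t is derived as follows:

\[
\dfrac{s \sqsupseteq t \qquad \forall j.\ s >_{\mathsf{spo}} t_j \qquad
       (s_1, \ldots, s_n) >_{\mathsf{spo}}^{\mathsf{lex}} (t_1, \ldots, t_m)}
      {s = f(s_1, \ldots, s_n) >_{\mathsf{spo}} g(t_1, \ldots, t_m) = t}\ \text{SPO 2b}
\]

  Because of s ⊒ t, we have s >A t or both s ≥A t and f ≿ g. In the former case
  s >wpo t is immediate. In the latter case s >wpo t is derived by WPO 2b(ii)
  as follows:

\[
\dfrac{s \geq_{\mathcal{A}} t \qquad f \succsim g \qquad
  \dfrac{\forall j.\ s >_{\mathsf{spo}} t_j}{\forall j.\ s >_{\mathsf{wpo}} t_j}\ \text{I.H.} \qquad
  \dfrac{(s_1, \ldots, s_n) >_{\mathsf{spo}}^{\mathsf{lex}} (t_1, \ldots, t_m)}
        {(s_1, \ldots, s_n) >_{\mathsf{wpo}}^{\mathsf{lex}} (t_1, \ldots, t_m)}\ \text{I.H.}}
  {s = f(s_1, \ldots, s_n) >_{\mathsf{wpo}} g(t_1, \ldots, t_m) = t}\ \text{WPO 2b(ii)}
\]

In any case we have s >wpo t.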


As a consequence, >wpo and >spo coincide, provided that A is simple. This 
result can be extended to monotonic semantic path orders. 


Lemma 8. The triple (≥A, ⊒, ⊐) is a reduction triple.


Let >mspo denote the monotonic semantic path order induced from ≥A and
>spo. Since s >wpo t implies s ≥A t (Lemma 6), s >mspo t is equivalent to
s >spo t. By using this equivalence together with Lemmata 5 and 7, we obtain
the following result.


Theorem 4. The three orders >wpo, >spo, and >mspo coincide, provided that A
is simple.


5 Generalized Weighted Path Orders


According to Theorem 4, weighted path orders can be defined as monotonic 
semantic path orders. Moreover, Lemma 8 reveals that even for non-simple alge- 
bras the construction of reduction triples is valid. This observation suggests a 
generalization of weighted path orders, which does not impose simplicity on 
algebras. Besides, we exploit the fact that stable order pairs need not be closed 
under contexts, marking root symbols of function applications; see [1] and [5, 
Definition 5]. 

Let F be a signature. For each f ∈ F we associate a marked function symbol
f♯ ∉ F of the same arity. The set {f♯ | f ∈ F} is denoted by F♯. For each
term t = f(t1,...,tn) ∈ T(F, V) we denote f♯(t1,...,tn) by t♯. Let A be a
weakly monotone well-founded (F ∪ F♯)-algebra and ≿ a well-founded precedence
on F. The pair (⊒♯, ⊐♯) of relations on T(F, V) \ V is defined as follows: Let
s = f(s1,...,sn) and t = g(t1,...,tm). We write s ⊒♯ t if s♯ >A t♯, or s♯ ≥A t♯
and f ≿ g. Similarly, we write s ⊐♯ t if s♯ >A t♯, or s♯ ≥A t♯ and f > g. The
relation ≥ is defined as the restriction of ≥A to T(F, V).


Proposition 2. The triple (≥, ⊒♯, ⊐♯) is a reduction triple on T(F, V).


Definition 5. The generalized weighted path order (GWPO) >gwpo induced
from A and ≿ is the monotonic semantic path order induced from (≥, ⊒♯, ⊐♯).


Corollary 1. Every generalized weighted path order is a reduction order, pro- 
vided that the signature is bounded. 


For convenience, we reformulate the definition of >gwpo in the style of Defi- 
nition 1. 


Definition 6. The relation >wpo' is defined on terms as follows: s >wpo' t if
s = f(s1,...,sm) and one of the following conditions holds.


1. si ≥wpo' t for some 1 ≤ i ≤ m.

2. t = g(t1,...,tn), s♯ ≥A t♯, and s >wpo' tj for all 1 ≤ j ≤ n, and moreover
   a. s♯ >A t♯,
   b. f > g, or
   c. f ≿ g and (s1,...,sm) >wpo'^lex (t1,...,tn).


Proposition 3. The SPO >spo induced from (⊒♯, ⊐♯) coincides with >wpo'. For
all terms s and t the relation s >gwpo t is equivalent to s ≥A t and s >wpo' t.


Corollary 2. The relations >gwpo and >wpo coincide, provided that A is simple
and f_A(a1,...,an) = f♯_A(a1,...,an) for all f^(n) ∈ F.


Since polynomial interpretation orders [15] and Knuth-Bendix orders [13] as
well as LPOs are simulated by WPOs [27], they are also subsumed by GWPOs.
We demonstrate termination proofs by GWPOs with a few examples. None of
these examples can be handled by WPOs.

Example 7. Consider the TRS R for round-up division:

p(0) → 0       x − 0 → x              0 ÷ s(y) → 0
p(s(x)) → x    x − s(y) → p(x) − y    s(x) ÷ s(y) → s((x − y) ÷ s(y))


Let A be the weakly monotone algebra on N with the interpretations 


0a=0 sa(xz)=xr+1 pa(e)=U F-p~y=U T+AYST 


and let ≿ be an arbitrary precedence. The GWPO induced from A and ≿ orients
all rules in R. In particular, x − s(y) >wpo' p(x) − y is derived from the inequalities
x −♯ s(y) >A p(x) −♯ y and x −♯ s(y) >A p♯(x).


Example 8. Consider the TRS R taken from [2, Example 4.28], which computes
the bit length of a natural number:


half(0) → 0    half(s(0)) → 0    half(s(s(x))) → s(half(x))
bits(0) → 0    bits(s(x)) → s(bits(half(s(x))))


Let A be the weakly monotone algebra on N with:

0_A = 0     s_A(x) = x + 1     half_A(x) = max{0, x − 1}     bits_A(x) = x
0♯_A = 0    s♯_A(x) = x + 1    half♯_A(x) = max{0, x − 1}    bits♯_A(x) = x


The GWPO >gwpo induced by A and a precedence ≿ with half, bits > s satisfies
R ⊆ >gwpo as ℓ ≥A r and ℓ >wpo' r for all rules ℓ → r ∈ R. In particular,
bits(s(x)) >wpo' s(bits(half(s(x)))) is derived as follows. The inequality
bits(s(x)) >wpo' bits(half(s(x))) is derived from repeated application of WPO' 2a:

\[
\dfrac{\mathsf{bits}^\sharp(\mathsf{s}(x)) >_{\mathcal{A}} \mathsf{bits}^\sharp(\mathsf{half}(\mathsf{s}(x))) \qquad
  \dfrac{\mathsf{bits}^\sharp(\mathsf{s}(x)) >_{\mathcal{A}} \mathsf{half}^\sharp(\mathsf{s}(x)) \qquad
    \dfrac{\mathsf{s}(x) \geq_{\mathsf{wpo}'} \mathsf{s}(x)}
          {\mathsf{bits}(\mathsf{s}(x)) >_{\mathsf{wpo}'} \mathsf{s}(x)}}
        {\mathsf{bits}(\mathsf{s}(x)) >_{\mathsf{wpo}'} \mathsf{half}(\mathsf{s}(x))}}
  {\mathsf{bits}(\mathsf{s}(x)) >_{\mathsf{wpo}'} \mathsf{bits}(\mathsf{half}(\mathsf{s}(x)))}
\]

Thus, bits(s(x)) >wpo' s(bits(half(s(x)))) follows from WPO' 2b with bits > s and
bits♯(s(x)) ≥A s♯(bits(half(s(x)))).
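
The arithmetic behind Example 8 is easy to spot-check. The sketch below is only a sanity check over a finite sample of N, not a proof, and it assumes that every marked symbol is interpreted like its unmarked counterpart; the helper `holds` and the variable names are hypothetical.

```python
# Interpretations of Example 8 on N. Assumption: each marked symbol is
# interpreted exactly like its unmarked counterpart.
def s(x):
    return x + 1

def half(x):
    return max(0, x - 1)

def bits(x):
    return x

s_m, half_m, bits_m = s, half, bits   # interpretations of the marked symbols
ZERO = 0

SAMPLE = range(200)   # finite sample of N: a sanity check, not a proof

def holds(pred) -> bool:
    """True if the predicate holds for every sampled value of x."""
    return all(pred(x) for x in SAMPLE)

# Weak decrease l >=_A r for the five rules of R.
weak_rules = [
    half(ZERO) >= ZERO,
    half(s(ZERO)) >= ZERO,
    holds(lambda x: half(s(s(x))) >= s(half(x))),
    bits(ZERO) >= ZERO,
    holds(lambda x: bits(s(x)) >= s(bits(half(s(x))))),
]

# Comparisons of marked roots used when orienting the last rule.
step_2a_half = holds(lambda x: bits_m(s(x)) > half_m(s(x)))
step_2a_bits = holds(lambda x: bits_m(s(x)) > bits_m(half(s(x))))
step_2b_weak = holds(lambda x: bits_m(s(x)) >= s_m(bits(half(s(x)))))
step_2b_strict = holds(lambda x: bits_m(s(x)) > s_m(bits(half(s(x)))))

print(all(weak_rules), step_2a_half, step_2a_bits, step_2b_weak, step_2b_strict)
```

The last comparison is only weak, which is why the final step of the example needs the precedence bits > s rather than a strict decrease of the marked interpretation.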


6 Experimental Results 


In order to evaluate GWPOs in termination analysis we implemented a prototype
termination tool based on Proposition 1 and Corollary 1. Following the automation
techniques of WPO [27], we search for a suitable weakly monotone well-founded
algebra from two classes of algebras over N: one is linear interpretations and the
other is max/plus interpretations. Since simplicity of algebras is not required for
GWPOs, we may use more general forms of interpretations.

Linear Interpretations. Algebras A of this class use linear polynomials over
N like Example 7. For each f^(n) ∈ F ∪ F♯ its interpretation is of the form
f_A(x1,...,xn) = c0 + c1x1 + ··· + cnxn where c0 ∈ N and c1,...,cn ∈ {0,1}.
Simple monotone algebras for WPOs³ are obtained by setting c1 = ··· = cn =
1, f_A = f♯_A for all f^(n) ∈ F, and those for Knuth-Bendix orders (KBOs) are
obtained by further restriction for admissibility, see [27]. Comparison of linear
polynomials is reduced to that of coefficients by using the following trivial fact:


Proposition 4. Let f(x1,...,xn) = c0 + c1x1 + ··· + cnxn and g(x1,...,xn) =
d0 + d1x1 + ··· + dnxn be linear polynomials over N. The next statements hold.


- f ≥ g if and only if c0 ≥ d0 and ci ≥ di for all 1 ≤ i ≤ n.
- f > g if and only if c0 > d0 and ci ≥ di for all 1 ≤ i ≤ n.


Here f ≥ g (f > g) means that f(a1,...,an) ≥ g(a1,...,an) (f(a1,...,an) >
g(a1,...,an)) for all a1,...,an ∈ N.
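
The coefficient test of Proposition 4 translates directly into code. The following is a minimal sketch, assuming a polynomial c0 + c1x1 + ··· + cnxn is stored as the list [c0, c1, ..., cn]; the function names and the sample interpretations are hypothetical.

```python
from typing import Sequence

def poly_geq(c: Sequence[int], d: Sequence[int]) -> bool:
    """f >= g on N, where c and d list constant and variable coefficients."""
    if len(c) != len(d):
        raise ValueError("polynomials must range over the same variables")
    return all(ci >= di for ci, di in zip(c, d))

def poly_gt(c: Sequence[int], d: Sequence[int]) -> bool:
    """f > g on N: strictly larger constant, no smaller variable coefficient."""
    if len(c) != len(d):
        raise ValueError("polynomials must range over the same variables")
    return c[0] > d[0] and all(ci >= di for ci, di in zip(c[1:], d[1:]))

# A variable-duplicating rule f(x) -> g(x, x), with illustrative choices:
lhs = [1, 1]             # f_A(x) = 1 + x
rhs_simple = [0, 2]      # g_A(x, y) = x + y, hence g(x, x) evaluates to 2x
rhs_nonsimple = [0, 1]   # g_A(x, y) = x,     hence g(x, x) evaluates to x

print(poly_geq(lhs, rhs_simple), poly_gt(lhs, rhs_nonsimple))
```

With all variable coefficients forced to 1 the duplicated variable makes the right-hand side grow faster than the left, whereas a {0,1}-coefficient interpretation can drop one argument.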


Max/plus Interpretations. Algebras A of this class use a combination of + and
max like Example 8. For each f^(n) ∈ F ∪ F♯ its interpretation is of the form
f_A(x1,...,xn) = max{c0, c1 + c1'x1, ..., cn + cn'xn} where c0 ∈ N, c1,...,cn ∈ Z
and c1',...,cn' ∈ {0,1}. Simple monotone algebras for WPOs are obtained by
imposing c1,...,cn ∈ N, c1' = ··· = cn' = 1, f_A = f♯_A for all f^(n) ∈ F, and
algebras for lexicographic path orders (LPOs) are obtained by additionally setting
c0 = c1 = ··· = cn = 0 for all f^(n) ∈ F as in [27]. The restriction c1,...,cn ∈ N
is necessary for WPOs because allowing c1,...,cn < 0 results in non-simple
interpretations such as max{0, x − 1}. Under this form of algebras, an interpretation
of a term is flattened to the form of max{g1,...,gm} where g1,...,gm are
linear polynomials over N. So comparison of max/plus interpretations is reduced
to that of coefficients, using the following trivial fact and Proposition 4 in turn:


Proposition 5. Let G and H be non-empty sets of linear polynomials over N.
The next statements hold.


- max G ≥ max H if and only if for every h ∈ H there exists a linear polynomial
  g ∈ G with g ≥ h.

- max G > max H if and only if for every h ∈ H there exists a linear polynomial
  g ∈ G with g > h.
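
The criterion of Proposition 5 can be layered on top of the coefficient test for linear polynomials. The sketch below implements the criterion as stated, again assuming polynomials are stored as coefficient lists [c0, c1, ..., cn]; all names are hypothetical.

```python
def poly_geq(c, d) -> bool:
    """Coefficient-wise test for f >= g (lists [c0, c1, ..., cn])."""
    return all(ci >= di for ci, di in zip(c, d))

def poly_gt(c, d) -> bool:
    """Coefficient-wise test for f > g."""
    return c[0] > d[0] and all(ci >= di for ci, di in zip(c[1:], d[1:]))

def _check_non_empty(G, H) -> None:
    if not G or not H:
        raise ValueError("both sets of polynomials must be non-empty")

def max_geq(G, H) -> bool:
    """Every member of H is weakly dominated by some member of G."""
    _check_non_empty(G, H)
    return all(any(poly_geq(g, h) for g in G) for h in H)

def max_gt(G, H) -> bool:
    """Every member of H is strictly dominated by some member of G."""
    _check_non_empty(G, H)
    return all(any(poly_gt(g, h) for g in G) for h in H)

# One-variable polynomials as [constant, coefficient of x].
G = [[1, 1]]            # max{x + 1}
H = [[0, 0], [0, 1]]    # max{0, x}

print(max_gt(G, H), max_gt(H, G), max_geq(G, G), max_gt(G, G))
```

Here G and H are the flattened forms of x + 1 and max{0, x}, the kind of comparison that arises with interpretations such as max{0, x − 1}.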


Since precedence constraints can be regarded as inequalities on natural numbers
[28], searching for a suitable combination of a precedence and an interpretation
is done by solving linear arithmetic constraints (with if-then-else expressions).

The problem set for experiments consists of 1511 term rewrite systems from 
version 11.3 of the Termination Problem Database (TPDB) [21]. The reference 
implementation uses the SMT solver Z3 [17] as an external tool for solving linear 
constraints. The experiments were run on a PC with Intel Core i7-1065G7 CPU 
(1.30 GHz) and 16 GB memory. 


3 Our WPOs based on linear interpretations correspond to WPO(Sum) by Yamada
et al. [27] but without status functions.

Table 1. Experiments on 1511 TRSs from TPDB 11.3.

interpretations      |      linear       |     max/plus
order                | KBO   WPO   GWPO  | LPO   WPO   GWPO
proved TRSs          | 103   122   357   | 149   221   385
timeouts (60 sec)    |   8     9     9   |  12    12    28


Now let us discuss the experimental results. Table 1 shows that, as a whole,
the use of non-simple algebras substantially improves termination analysis, at the
small cost of extra running time. In particular, in the case of linear interpretations,
GWPOs significantly outperform WPOs (357 versus 122 proved TRSs). As a matter
of fact, linear WPOs are unable to orient variable-duplicating rules ℓ → r such as
f(x) → g(x, x) since ℓ ≥A r cannot be satisfied, but this does not apply to GWPOs
based on linear interpretations with {0, 1}-coefficients. In the case of max/plus
interpretations there are two TRSs (with over 100 rules) that are proved to be
terminating by WPOs, but not by GWPOs due to the time limit. This indicates that
using non-simple algebras for max/plus interpretations can enlarge the search
space. This is not the case for linear interpretations.


7 Simulating Dependency Pairs by GWPOs 


The power of GWPOs observed in Sect. 6 can partly be explained by the
fact that GWPOs are capable of simulating a basic result of the dependency pair
method [1]. To show this, we recall the dependency pair method. The set
DP(R) of dependency pairs of a TRS R is defined as follows:

\[
  \mathsf{DP}(\mathcal{R}) = \{\, \ell^\sharp \to g^\sharp(t_1, \ldots, t_n) \mid
    \ell \to r \in \mathcal{R},\ r \trianglerighteq g(t_1, \ldots, t_n), \text{ and } g \in \mathcal{D}_{\mathcal{R}} \,\}
\]
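
The definition of DP(R) can be read as a small procedure. The following is a minimal sketch, assuming terms are encoded as nested tuples (symbol, arg1, ..., argn) with variables as plain strings, that marked symbols carry a trailing "#", and that the defined symbols are the roots of left-hand sides; the encoding and all function names are hypothetical.

```python
def subterms(t):
    """Yield t and all of its subterms (variables are plain strings)."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def mark(t):
    """Replace the root symbol f of a non-variable term by f#."""
    return (t[0] + "#",) + t[1:]

def dependency_pairs(rules):
    """Dependency pairs of a TRS given as a list of (lhs, rhs) pairs."""
    defined = {lhs[0] for lhs, _ in rules}   # roots of left-hand sides
    pairs = []
    for lhs, rhs in rules:
        for u in subterms(rhs):
            if isinstance(u, tuple) and u[0] in defined:
                dp = (mark(lhs), mark(u))
                if dp not in pairs:          # keep each pair once
                    pairs.append(dp)
    return pairs

# R = { f(f(x)) -> f(g(f(x))),  f(x) -> g(x) }
R = [
    (("f", ("f", "x")), ("f", ("g", ("f", "x")))),
    (("f", "x"), ("g", "x")),
]

for lhs, rhs in dependency_pairs(R):
    print(lhs, "->", rhs)
```

For this R only f is defined, so the second rule contributes no dependency pair and the first contributes one pair per occurrence of f in its right-hand side.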


An order pair (≿, ⊐) on terms is a reduction pair if ≿ is a rewrite preorder and
⊐ is a well-founded stable order. The following theorem states a basic result of
the dependency pair method.


Theorem 5 ([1]). A TRS R is terminating if R ⊆ ≿ and DP(R) ⊆ ⊐ for
some reduction pair (≿, ⊐).


We illustrate Theorem 5, using the fact that every weakly monotone algebra
A on N induces the reduction pair (≥A, >A).


Example 9. Consider the TRS R = {f(f(x)) → f(g(f(x))), f(x) → g(x)}. We
show the termination of R using Theorem 5. The set DP(R) consists of the two
dependency pairs:


f♯(f(x)) → f♯(g(f(x)))        f♯(f(x)) → f♯(x)


4 The implementation and the detailed experimental data are available at:
https://www.jaist.ac.jp/project/maxcomp/23frocos/

By taking the {f, g, f♯, g♯}-algebra A with the interpretations


f_A(x) = x + 1    g_A(x) = 0    f♯_A(x) = x    g♯_A(x) = 1

the inclusions R ⊆ ≥A and DP(R) ⊆ >A hold. Hence, R is terminating.


We show that every termination proof by Theorem 5 with a weakly monotone
algebra on N can be simulated by a GWPO. This class of algebras includes linear
polynomial interpretations and max/plus interpretations described in Sect. 6.
Let R be a TRS and A a weakly monotone (F ∪ F♯)-algebra on N satisfying
R ⊆ ≥A and DP(R) ⊆ >A. Define the (F ∪ F♯)-algebra B on N by

\[
  f_{\mathcal{B}}(a_1, \ldots, a_n) = f_{\mathcal{A}}(a_1, \ldots, a_n)
  \qquad
  f^\sharp_{\mathcal{B}}(a_1, \ldots, a_n) =
  \begin{cases}
    f^\sharp_{\mathcal{A}}(a_1, \ldots, a_n) + 1 & \text{if } f \in \mathcal{D}_{\mathcal{R}} \\
    0 & \text{otherwise}
  \end{cases}
\]

for each f^(n) ∈ F. Let >gwpo and >wpo' denote the orders induced from B and
an arbitrary but fixed precedence. First, let us see that R ⊆ >gwpo holds for the
last example.


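Example 10 (continued from Example 9). The corresponding algebra B is:

f_B(x) = x + 1    g_B(x) = 0    f♯_B(x) = x + 1    g♯_B(x) = 0


We have R ⊆ ≥B by construction. The inequality f(f(x)) >wpo' f(g(f(x))) is
derived by successive application of WPO' 2a as follows:

\[
\dfrac{f^\sharp(f(x)) >_{\mathcal{B}} f^\sharp(g(f(x))) \qquad
  \dfrac{f^\sharp(f(x)) >_{\mathcal{B}} g^\sharp(f(x)) \qquad
    \dfrac{f^\sharp(f(x)) >_{\mathcal{B}} f^\sharp(x) \qquad
      \dfrac{\dfrac{x \geq_{\mathsf{wpo}'} x}{f(x) >_{\mathsf{wpo}'} x}}
            {f(f(x)) >_{\mathsf{wpo}'} x}}
         {f(f(x)) >_{\mathsf{wpo}'} f(x)}}
       {f(f(x)) >_{\mathsf{wpo}'} g(f(x))}}
     {f(f(x)) >_{\mathsf{wpo}'} f(g(f(x)))}
\]

The inequality f(x) >wpo' g(x) follows from f♯(x) >B g♯(x). Hence R ⊆ >gwpo.
Note that neither f♯(f(x)) >A g♯(f(x)) nor f♯(x) >A g♯(x) holds.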
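
The passage from A to B is mechanical, so it can be sketched in a few lines. The code below is an illustration under stated assumptions: terms are nested tuples with the single variable "x", marked symbols carry a trailing "#", the algebra A is the one with f_A(x) = x + 1, g_A(x) = 0, f♯_A(x) = x, g♯_A(x) = 1, and strictness is only tested on a finite sample of N. The names `build_B`, `value` and `gt` are hypothetical.

```python
DEFINED = {"f"}   # defined symbols of R = {f(f(x)) -> f(g(f(x))), f(x) -> g(x)}

# Algebra A; "#" marks the marked symbols.
A = {
    "f": lambda x: x + 1,
    "g": lambda x: 0,
    "f#": lambda x: x,
    "g#": lambda x: 1,
}

def build_B(alg, defined):
    """Unmarked symbols are kept; a marked symbol becomes the old
    interpretation plus 1 if its symbol is defined, and constant 0 otherwise."""
    B = {}
    for name, fn in alg.items():
        if not name.endswith("#"):
            B[name] = fn
        elif name[:-1] in defined:
            B[name] = lambda *args, fn=fn: fn(*args) + 1
        else:
            B[name] = lambda *args: 0
    return B

def value(alg, term, x):
    """Evaluate a term (the variable "x" or a tuple (symbol, args...)) at x."""
    if term == "x":
        return x
    return alg[term[0]](*(value(alg, arg, x) for arg in term[1:]))

def gt(alg, lhs, rhs, sample=range(100)):
    """Strict comparison on a finite sample of N: a sanity check, not a proof."""
    return all(value(alg, lhs, x) > value(alg, rhs, x) for x in sample)

B = build_B(A, DEFINED)
fx = ("f", "x")

in_B = [
    gt(B, ("f#", fx), ("f#", "x")),
    gt(B, ("f#", fx), ("g#", fx)),
    gt(B, ("f#", fx), ("f#", ("g", fx))),
    gt(B, ("f#", "x"), ("g#", "x")),
]
in_A = [
    gt(A, ("f#", fx), ("g#", fx)),
    gt(A, ("f#", "x"), ("g#", "x")),
]
print(in_B, in_A)
```

All four comparisons hold in B, while the two comparisons against g♯ fail in A at x = 0, matching the remark that A itself does not provide these inequalities.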


Now we verify that R ⊆ >gwpo holds in general. By construction R ⊆ ≥B
is immediate from R ⊆ ≥A. So it remains to show R ⊆ >wpo'. We prove the
following stronger property.


Lemma 9. Let ℓ → r ∈ R. For every subterm t of r the relation ℓ >wpo' t holds.


Proof. We use structural induction on t. If t is a variable, then t must be a
subterm of ℓ, and thus ℓ >wpo' t. Otherwise, t is of the form g(t1,...,tn). The
induction hypothesis yields ℓ >wpo' tj for all 1 ≤ j ≤ n. We claim ℓ♯ >B t♯, from
which the desired inequality ℓ >wpo' t follows by WPO' 2a. To show the claim,
consider an arbitrary assignment α for B. Depending on g, we distinguish two
cases.

- If g ∉ D_R then [α]_B(ℓ♯) = [α]_A(ℓ♯) + 1 > 0 = [α]_B(t♯).
- If g ∈ D_R then ℓ♯ → t♯ ∈ DP(R). The assumption DP(R) ⊆ >A yields the
  inequality ℓ♯ >A t♯. Thus, [α]_B(ℓ♯) = [α]_A(ℓ♯) + 1 > [α]_A(t♯) + 1 = [α]_B(t♯).


In either case [α]_B(ℓ♯) > [α]_B(t♯) is obtained. Hence, ℓ♯ >B t♯ holds.


Theorem 6. The inclusion R ⊆ >gwpo holds.


8 Conclusion 


We have shown that weighted path orders can be simulated by a suitable variant 
of SPOs based on order pairs, and introduced a generalization of WPOs whose 
termination proving power goes beyond the realm of simple termination. To 
conclude the paper, we discuss related work and future work. 


Simulating KBOs by SPOs. A key observation for simulating WPOs by SPOs 
is that weight comparison can be simulated by successive application of SPO 1 
and SPO 2a as observed in Example 6. Another observation is that the SPOs are
already reduction orders without the help of harmonious rewrite preorders. These
two observations are due to Geser's work [9, Theorem 5], where it is shown that
extended KBOs [7, Sect. 5] can be simulated by SPOs. Unifying our result and 
Geser’s result is future work. 


General Path Orders. In this paper the lexicographic versions of path orders 
were investigated. However, it is very likely that the same result can be obtained 
even if we adopt multiset comparison or status functions. General path orders 
(GPOs) [8,10] are a unifying framework for such extensions, parameterizing the 
way to compare arguments. It is worth investigating simulation results between 
GPOs and WPOs by extending the parameters of GPOs so as to take order 
pairs. 


Reduction Pairs Based on WPOs. In order to build reduction pairs from WPOs
Yamada et al. [27, Sect. 4] extended the definition of WPOs by the notion of
partial status function π. The extension allows us to specify argument positions
π(f) = [i1,...,im] compared in WPO 2b and WPO 2b(ii) for each function symbol
f^(n) ∈ F ∪ F♯. We anticipate that partial status functions can also be integrated
into GWPOs and the thus-obtained version characterizes the reduction
pair version of WPOs.

Abstract. KBO constraint solving is very well-known to be an NP-
complete problem. Motivated by the needs of the family of SCL calculi,
we consider the particular case where all terms occurring in a constraint 
are bound by a (single) ground term. We show that this problem and 
variants of this problem remain NP-complete even if the form of atoms in 
the constraint is further restricted. In addition, for a non-strict, partial 
term ordering solely based on symbol counting constraint solving remains 
NP-complete. Nevertheless, we provide a new simple algorithm testing 
KBO constraint solvability that performs well on benchmark examples. 


Keywords: KBO Constraint Solving · NP-complete problem · Weight
Ordering Constraint Solving


1 Introduction 


The family of SCL calculi (Clause Learning from Simple Models) [2,5,13] perform
reasoning on a set of first-order clauses. They develop a trail of ground
literals with respect to a ground term (atom) bound β and an ordering ≺. All
ground literals on the trail are ≺ (or ≼) smaller than the ground term (atom)
β, and ≺ should in particular have the property that for any term t there are
only finitely many literals s such that s ≺ t. In case SCL does not detect a conflict
with respect to a finite, exhaustive trail of ground literals, they constitute
a model candidate for the clause set [4]. If SCL detects a conflict it learns a
new first-order non-ground clause. It is derived by resolution and factoring with
guidance from the trail. A natural choice for the ordering ≺ is the Knuth-Bendix
ordering (KBO) [9]. For the ground case, a KBO relation can be efficiently computed
[14]. All SCL calculi propagate literals from clauses with respect to the
trail. For example, given a trail [P(a)] and a clause ¬P(x) ∨ R(x, y) the literal
R(a, y) could be propagated. The SCL theory only enables ground literals
on the trail; however, in practice it is not affordable to put all groundings of
R(a, y) on the trail that are ≺ smaller than β. Therefore, we already considered
trail literals with variables when we developed a two-watched literal scheme for
SCL [3]. Recall that this propagation situation is not exceptional as typically
not all literals in a clause carry all occurring variables. The consequence of this
extension is that for SCL we now need to decide solvability of conjunctions of
inequations ti ≺ β where the ti may contain (shared) variables, i.e., we have to
decide solvability of a particular form of KBO constraints if ≺ is the KBO.

© The Author(s) 2023

For the SCL(EQ) calculus [13] the requirements on constraint solving get
more sophisticated. Now the trail is a sequence of unit (in)equalities and propagation
and conflicting clauses are decided with respect to the resulting congruence.
For an extended congruence closure algorithm [6,8,15,16] we now need, in
addition to inequations ti ≺ β, to consider inequalities ti ≠ si in order to separate
congruence classes. In its simplest form, constraints consist of inequations ti ≺ β
and inequalities ti ≠ si where β and the si are ground, so-called simple right-ground
constraints, Definition 4. In a more general setting, the si carry variables
and then a quantifier alternation on variables occurring in si but not in ti needs
to be considered. Such constraints are called alternating, Definition 27.

In this paper we investigate the complexity of all these variants with respect
to a KBO ≺, Definitions 3, 4, 25, 27, but also a weaker non-strict ordering based
on pure symbol counting, Definition 22. Except for constraints bound by a single
ground term, Proposition 26, all problems are NP-hard, Propositions 5, 21, 24,
28.

Korovin and Voronkov developed a decision procedure [10] for KBO constraints
consisting of inequations si ≺ ti only and refined it to an NP algorithm
[11]. According to Löchner [14], these results are “of more theoretical interest”
because they are “too involved to be implemented with reasonable effort”. In
fact, to the best of our knowledge we present the first implemented algorithm for
KBO constraint solving in this paper. Later, Korovin and Voronkov [12] showed
that checking satisfiability of a KBO constraint consisting of a single inequation
s ≺ t can be done in polynomial time. For the special case of a right-ground
constraint consisting of a single inequation s ≺ t, what their algorithm essentially
does is assigning the minimal constant to every variable.

To the best of our knowledge the problem of simple right-ground KBO con- 
straints has never been studied before. We are also not aware of any implemen- 
tation of a KBO constraint solving algorithm. The paper is now organized as 
follows: In Sect. 3 we prove the NP-completeness of this problem and present
an algorithm to solve it. In Sect. 4 we study the complexity of variants of this
problem including alternating constraints. We also consider a non-strict, partial 
ordering based on symbol counting and weaker than a KBO. The algorithm for 
right-ground constraints is extended to alternating constraints. In Sect. 5 we put 
the algorithm developed in Sects. 3 and 4 to practice and end the paper with a 
discussion of the obtained results, Sect. 6. 


2 Preliminaries 


In the following let Σ be a signature, i.e., a finite set of function symbols. Every
function symbol f has an associated arity which we denote by arity(f). Function
symbols c with arity(c) = 0 are called constants. We denote the set of all terms
by T(Σ, X) where X is an infinite set of variables. Vars(t) denotes the set of
variables occurring in the term t. A term t is called ground if it contains no
variables, i.e., Vars(t) = ∅. The set of all ground terms is denoted by T(Σ).
We assume that Σ contains at least one non-constant function and at least one
constant, i.e., that T(Σ) is infinite. For otherwise, constraint solving becomes
trivial. A substitution is a mapping σ : X → T(Σ, X) such that σ(x) ≠ x for
only finitely many x ∈ X. The application tσ of a substitution σ to a term
t ∈ T(Σ, X) is defined in the usual way. We call a substitution grounding for
some term t ∈ T(Σ, X) if tσ is ground. A substitution σ is a matcher from s
to t if sσ = t. We consider the following version of the Knuth-Bendix ordering
(KBO) on ground terms:


Definition 1 (KBO on Ground Terms [9]). Let > be a strict total ordering
(a precedence) on Σ, and w : Σ → N⁺ a weight function. w is extended to terms
recursively by w(f(t1,...,tn)) = w(f) + Σ_{i=1..n} w(ti). The Knuth-Bendix ordering
≻KBO induced by > and w is defined by s ≻KBO t iff


1. w(s) > w(t), or

2. w(s) = w(t), and

   (a) s = f(s1,...,sm), t = g(t1,...,tn) and f > g, or

   (b) s = f(s1,...,sm), t = f(t1,...,tm) and (s1,...,sm) ≻KBO^lex (t1,...,tm).


In particular, the precedence is strict and total, no unary function f with
w(f) = 0 is allowed and all weights are natural numbers. It can be shown
that ≻KBO is a strict, total and well-founded ordering on ground terms. In the
following, we simply write ≻ for ≻KBO.
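
Definition 1 is directly executable on ground terms. The following is a minimal sketch, assuming ground terms are encoded as nested tuples (symbol, arg1, ..., argn), weights as a dictionary, and the precedence as a dictionary of ranks where a larger rank means a greater symbol; the encoding, the names, and the sample signature with unit weights are illustrative choices.

```python
def weight(t, w):
    """w(f(t1,...,tn)) = w(f) + w(t1) + ... + w(tn)."""
    return w[t[0]] + sum(weight(arg, w) for arg in t[1:])

def kbo_gt(s, t, w, prec):
    """True iff s is greater than t in the KBO induced by prec and w."""
    ws, wt = weight(s, w), weight(t, w)
    if ws != wt:
        return ws > wt                      # case 1: larger weight wins
    if s[0] != t[0]:
        return prec[s[0]] > prec[t[0]]      # case 2(a): compare root symbols
    for si, ti in zip(s[1:], t[1:]):        # case 2(b): lexicographic comparison
        if si != ti:
            return kbo_gt(si, ti, w, prec)
    return False                            # s and t are identical

# Illustrative instance: unit weights and precedence f > g > a.
W = {"a": 1, "g": 1, "f": 1}
PREC = {"a": 0, "g": 1, "f": 2}

a = ("a",)
ga = ("g", a)
ggga = ("g", ("g", ga))
faaa = ("f", a, a, a)

print(kbo_gt(ga, a, W, PREC), kbo_gt(faaa, ggga, W, PREC), kbo_gt(ga, ga, W, PREC))
```

The second comparison is between two terms of weight 4 and is decided by the precedence, while the third shows irreflexivity on identical terms.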


Definition 2. A KBO constraint C is a finite set of atoms t # s where t, s ∈
T(Σ, X) and # ∈ {≺, ≻, ≠, ≼, ≽, =}. We say that C = {t1 #1 s1, ..., tn #n sn} is
satisfiable if there exists a substitution σ that is grounding for all tj, sj such that

\[
  \bigwedge_{j=1}^{n} t_j\sigma \mathrel{\#_j} s_j\sigma .
\]

Such a grounding substitution σ is called a solution.


Definition 3. A right-ground KBO constraint C is a KBO constraint where
s1,...,sn ∈ T(Σ), i.e., only the tj may contain variables.


Definition 4. A simple right-ground KBO constraint C is a right-ground KBO
constraint where # ∈ {≺, ≠}.

For simple right-ground KBO constraints, we prefer more explicit notation:
We now assume t1,...,tn, l1,...,lm ∈ T(Σ, X), s1,...,sn, r1,...,rm ∈ T(Σ)
and call C satisfiable if there exists a substitution σ that is grounding for all
tj, lj such that

\[
  \bigwedge_{j=1}^{n} t_j\sigma \prec s_j \;\wedge\; \bigwedge_{j=1}^{m} l_j\sigma \neq r_j .
\]


84 Y. Briefs et al. 


3 Simple, Right-Ground KBO Constraints 


We start by investigating the complexity of simple, right-ground KBO constraint 
solving. 


Proposition 5. Checking satisfiability for simple right-ground KBO constraints 
is NP-hard. 


Proof. We reduce from MONOTONE 3SAT which is NP-complete by [7]. Let
N ⊎ M be a set of clauses where N consists of the clauses with only positive
literals and M consists of the clauses with only negative literals. We consider a
signature with a constant a, a ternary function f and a unary function g. We use
a KBO instance where all weights are 1 and f > g > a. For every propositional
variable P occurring in N ⊎ M, we introduce a variable xP. Then the equation
xP = a stands for P is true and xP ≠ a stands for P is false.

Now every positive clause (P ∨ Q ∨ R) ∈ N is encoded as an inequation
f(xP, xQ, xR) ≺ f(g(a), g(a), g(a)). Obviously, this inequation can only be satisfied
by a grounding that maps at least one of these variables to a, i.e., that
sets at least one of P, Q, R to true.

Every negative clause (¬P ∨ ¬Q ∨ ¬R) ∈ M is encoded as an inequality
f(xP, xQ, xR) ≠ f(a, a, a). Obviously, this can only be satisfied if not all of these
variables are mapped to a, i.e., if at least one of P, Q, R is false.

Now the clause set has a solution iff there is a solution to the constructed
simple right-ground KBO constraint. Assume N ⊎ M is satisfiable by a valuation
β. Then for every propositional variable P map xP to a if β(P) = 1 and to g(a)
otherwise. As explained above, this grounding will satisfy the constraint. Now
let σ be a solution to the constraint. Then the valuation β where β(P) = 1 if
σ(xP) = a and β(P) = 0 otherwise satisfies N ⊎ M.

We have added |M| inequalities and |N| inequations which can be constructed
in polynomial time, so the reduction works in polynomial time.
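
The reduction can be tried out on small clause sets. The sketch below is an illustration, not the paper's implementation: it encodes clauses as triples of variable names, terms as nested tuples with variables as strings of the hypothetical form "x_P", and it searches only groundings with values in {a, g(a)}, which suffices by the argument above. The KBO is the instance with unit weights and f > g > a.

```python
from itertools import product

A_TERM = ("a",)
GA_TERM = ("g", A_TERM)
PREC = {"a": 0, "g": 1, "f": 2}          # f > g > a, all weights are 1

def weight(t):
    return 1 + sum(weight(arg) for arg in t[1:])

def kbo_gt(s, t):
    """KBO on ground terms with unit weights and precedence PREC."""
    if weight(s) != weight(t):
        return weight(s) > weight(t)
    if s[0] != t[0]:
        return PREC[s[0]] > PREC[t[0]]
    for si, ti in zip(s[1:], t[1:]):
        if si != ti:
            return kbo_gt(si, ti)
    return False

def subst(t, sigma):
    """Apply a grounding substitution; variables are plain strings."""
    if isinstance(t, str):
        return sigma[t]
    return (t[0],) + tuple(subst(arg, sigma) for arg in t[1:])

def encode(positive, negative):
    """Positive clauses become inequations, negative ones inequalities."""
    less = [(("f", *(f"x_{p}" for p in c)), ("f", GA_TERM, GA_TERM, GA_TERM))
            for c in positive]
    neq = [(("f", *(f"x_{p}" for p in c)), ("f", A_TERM, A_TERM, A_TERM))
           for c in negative]
    return less, neq

def solve(less, neq):
    """Brute-force search over groundings with values in {a, g(a)}."""
    variables = sorted({v for l, _ in less + neq for v in l[1:]})
    for choice in product([A_TERM, GA_TERM], repeat=len(variables)):
        sigma = dict(zip(variables, choice))
        if all(kbo_gt(r, subst(l, sigma)) for l, r in less) and \
           all(subst(l, sigma) != r for l, r in neq):
            return sigma
    return None

sat = solve(*encode([("P", "Q", "R")], [("P", "Q", "R")]))
unsat = solve(*encode([("P", "P", "P")], [("P", "P", "P")]))
print(sat is not None, unsat is None)
```

The second instance encodes the clauses P and ¬P (each literal repeated three times), so no grounding can satisfy both the inequation and the inequality.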


Proposition 6. Checking satisfiability for simple right-ground KBO constraints 
is in NP. 


Proof. Let C = {t1 < 81,...,tn < 8n,4, #171,.--;ln Æ Tm} be a constraint. 
If for some inequality l; A rj, there is no matcher from l; to rj, we can ignore 
this inequality since it is true for every grounding. If for some inequality l; 4 rj, 
it actually holds that 1; = rj, then this inequality is impossible to satisfy, so 
we are done. After sorting out these two cases, as r; is ground, every inequality 
l; Ar; has a unique matcher 7; which has linear size with respect to rj. In the 
following, we say that the term 7;() is restricted by the inequality l; A rj. The 
inequality l; A r; then signifies 


Vo ale) žy) 
zE Vars(l;) 


For the inequations t_j < s_j, it is obviously optimal to assign the smallest
possible term to every variable. Larger terms only have to be considered due to
the inequalities l_j ≠ r_j. If there is a grounding σ that satisfies t_j < s_j, then any
grounding σ' with σ'(x) ≤ σ(x) for all variables x satisfies t_j < s_j. Hence, if
there exists a solution, then there also exists a solution that only uses the m + 1
smallest terms for every variable. This is because every inequality l_j ≠ r_j only
restricts at most one term for every variable, so for every variable the m + 1
smallest terms contain the smallest term that is not restricted for that variable.

As we only have to consider the m + 1 smallest terms for every variable, the
size of the groundings we have to consider is polynomially bounded by the input
size. Let f be the function with the maximal arity and let p = arity(f). Let a
be the smallest constant. We claim that each of the m + 1 smallest terms has
at most mp + 1 symbols. Proof by contradiction: Assume t_0 is one of the m + 1
smallest terms with #t_0 > mp + 1. Perform the following m times: Obtain t_{i+1}
from t_i by replacing any subterm g(s_1, ..., s_q), where the s_i are constants, by a.
The number of symbols decreases by at most p, so #t_i > (m − i)p + 1. As none of the
t_i is a constant, such a subterm always exists. After m steps, we obtain terms
t_0 > t_1 > ··· > t_m with #t_m > (m − m)p + 1 = 1, i.e., t_m is not a constant, so
t_m > a. This contradicts the fact that t_0 was one of the m + 1 smallest terms
since at least m + 1 terms are smaller than t_0. Thus, we can guess a grounding
and check in polynomial time whether it is a solution.


Next we propose an algorithm for testing satisfiability of simple right-ground 
KBO constraints. Of course, by Proposition 6, there already exists an algorithm, 
but we expect that the following algorithm performs better in practice. Let C 
be a simple right-ground KBO constraint with n inequations t_j < s_j and m
inequalities l_j ≠ r_j.

Assume that Vars({t_j | 1 ≤ j ≤ n} ∪ {l_j | 1 ≤ j ≤ m}) = {x_1, ..., x_k}. As
explained in the proof of Proposition 6, we only have to consider the m + 1
smallest terms for the grounding, so to begin, we generate an ordered list S of
the m + 1 smallest terms. This way, a grounding substitution σ corresponds to a
vector v ∈ N^k where v_i < m + 1 is the index of the term σ(x_i) in S, i.e., S[v_i] =
σ(x_i). Let σ(v) with σ(v)(x_i) := S[v_i] denote the grounding corresponding to the
vector v. Later on, we give a dynamic programming algorithm to compute the
k smallest terms for some number k. Actually, we do not directly generate the 
m + 1 smallest terms, but start with a constant number of terms and generate 
more terms as needed. 

The algorithm is given by three inference rules that are represented by an
abstract rewrite system. They operate on a state which is either ⊥ or a four-tuple
(T; v; F; C) where T is a sequence of variables, the trace; v ∈ N^k is a grounding
substitution in vector notation, the current grounding; F is a set of forbidden
groundings; and C is a simple right-ground KBO constraint. The initial state
for a constraint C is (ε; (0, ..., 0); ∅; C), i.e., the trace is empty, every variable is
mapped to the smallest constant and there are no forbidden groundings.

We use the following partial ordering ≤_P on groundings: v ≤_P u iff for all
i ∈ {1, ..., k} we have v_i ≤ u_i. By inc(v, i) we denote the grounding v' with
v'_i = v_i + 1 and v'_l = v_l for all l ∈ {1, ..., k} with l ≠ i, i.e., the grounding where
we increase the term for the variable x_i by one. Analogously, we define dec(v, i),
where we instead decrease the term for the variable x_i by one, i.e., v'_i = v_i − 1.
The two operations inc and dec are only used when they are well-defined, i.e.,
they yield a grounding v' ∈ N^k where v'_i < m + 1. The operation inc is only used
when an inequality l_j ≠ r_j is not satisfied, and this can happen at most m times
without intermediate Backtrack steps. The operation dec(v, i) is only used for
Backtrack, and by Lemma 15, in this case v_i > 0.

The role of F is that we want to keep the algorithm from considering wrong
groundings again. For all u ∈ F, we do not visit states with grounding v if
v ≥_P u. When we Backtrack, we insert the current grounding into F. The trace
T records the last updated variables so Backtrack is able to undo the last Increase
operation. As will be proven in Theorem 18, the algorithm terminates in ⊥ iff
there exists no solution, and if there exists a solution, then it terminates in a
state where the current grounding v is a solution.
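
The vector view of groundings can be made concrete with a small Python sketch. The function names mirror the notation of this section, but the tuple representation, the 0-based variable index and the name is_forbidden are assumptions made only for illustration.

```python
from typing import Iterable, Tuple

# v[i] is the index (into the ordered term list S) of the term chosen for variable i.
Grounding = Tuple[int, ...]


def inc(v: Grounding, i: int) -> Grounding:
    """Move variable i to the next larger term in S."""
    return v[:i] + (v[i] + 1,) + v[i + 1:]


def dec(v: Grounding, i: int) -> Grounding:
    """Move variable i back to the next smaller term in S (requires v[i] > 0)."""
    assert v[i] > 0, "dec is only used when it is well-defined"
    return v[:i] + (v[i] - 1,) + v[i + 1:]


def leq_p(v: Grounding, u: Grounding) -> bool:
    """Pointwise ordering: v <=_P u iff v[i] <= u[i] for every i."""
    return all(a <= b for a, b in zip(v, u))


def is_forbidden(v: Grounding, forbidden: Iterable[Grounding]) -> bool:
    """A grounding is pruned if it lies above some element of F."""
    return any(leq_p(u, v) for u in forbidden)


# With S = [a, b, g(a, a)], the index vector (1, 0, 0) stands for the grounding (b, a, a).
F = {(1, 0, 0)}
print(is_forbidden((1, 1, 0), F))   # (b, b, a) lies above (b, a, a)
print(is_forbidden((0, 1, 0), F))   # (a, b, a) does not
```

Storing only the forbidden groundings themselves and testing v ≥_P u on demand keeps F small; the upward closure of F is never materialised.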


Increase    (T; v; F; C) ⇒_KCS (T x_i; v'; F; C)
provided v' = inc(v, i), l_jσ(v) = r_j for some l_j ≠ r_j ∈ C, l_jσ(v') ≠ r_j, and there
is no u ∈ F with v' ≥_P u


Backtrack    (T x_i; v; F; C) ⇒_KCS (T; v'; F ∪ {v}; C)
provided v' = dec(v, i) and either

1. l_jσ(v) = r_j for some l_j ≠ r_j ∈ C, but for all l ∈ {1, ..., k}, we have that
l_jσ(inc(v, l)) ≠ r_j implies that there is a u ∈ F with inc(v, l) ≥_P u, or
2. t_jσ(v) ≥ s_j for some t_j < s_j ∈ C


Fail    (ε; v; F; C) ⇒_KCS ⊥
provided either

1. l_jσ(v) = r_j for some l_j ≠ r_j ∈ C, but for all l ∈ {1, ..., k}, we have that
l_jσ(inc(v, l)) ≠ r_j implies that there is a u ∈ F with inc(v, l) ≥_P u, or
2. t_jσ(v) ≥ s_j for some t_j < s_j ∈ C


Informally, Increase is applicable if some inequality l_j ≠ r_j is not fulfilled and
we can fix this with the new grounding inc(v, i) which is not forbidden by F.
Backtrack undoes an operation and is applicable if either some inequality l_j ≠ r_j
is not fulfilled, but Increase is not applicable, or if some inequation t_j < s_j is not
fulfilled. Fail is applicable if Backtrack would be applicable on an empty trace, 
i.e., there is no operation to undo. 

Obviously, there is no state on which we can apply both Backtrack and Fail. 


Definition 7. A reasonable strategy is a strategy that prefers Backtrack and Fail 
over Increase. 




Example 8. Consider a signature with constants a, b, c and a binary function f.
We set w(a) = 1; w(b) = w(c) = 2; w(f) = 3 and a ≺ b ≺ c ≺ f. We consider the
constraint


C = {x_1 ≠ a, f(x_1, x_2) < f(a, c)}.


The m + 1 smallest terms, where m = 1, are a, b. This is the unique execution
of the algorithm. In order to increase readability, for v, we write the terms instead
of the indices.


(ε; (a, a); ∅; C)
⇒_KCS^Increase (x_1; (b, a); ∅; C)
⇒_KCS^Backtrack (ε; (a, a); {(b, a)}; C)
⇒_KCS^Fail ⊥


The algorithm terminates in ⊥, so there is no solution.


Example 9. Consider a signature with constants a, b, a binary function g and a
ternary function f. Let w(a) = 1, w(b) = w(f) = w(g) = 2 and a ≺ b ≺ g ≺ f.
The constraint is


C = {x_1 < b, g(x_2, a) < g(b, b), f(x_1, x_2, x_3) ≠ f(a, a, a), g(x_1, x_2) ≠ g(a, b)}.


The m + 1 smallest terms, where m = 2, are a, b, g(a, a).


(ε; (a, a, a); ∅; C)
⇒_KCS^Increase (x_1; (b, a, a); ∅; C)
⇒_KCS^Backtrack (ε; (a, a, a); {(b, a, a)}; C)
⇒_KCS^Increase (x_2; (a, b, a); {(b, a, a)}; C)
⇒_KCS^Increase (x_2 x_2; (a, g(a, a), a); {(b, a, a)}; C)
⇒_KCS^Backtrack (x_2; (a, b, a); {(b, a, a), (a, g(a, a), a)}; C)
⇒_KCS^Backtrack (ε; (a, a, a); {(b, a, a), (a, g(a, a), a), (a, b, a)}; C)
⇒_KCS^Increase (x_3; (a, a, b); {(b, a, a), (a, g(a, a), a), (a, b, a)}; C)


The algorithm has found a solution, so no rule is applicable and it terminates.
Note that after the third and fifth operation, we cannot increase x_1 because
(b, b, a) ≥_P (b, a, a) ∈ F.
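
The three rules can be prototyped in a few lines of Python. The following is a minimal sketch under stated assumptions, not the SPASS implementation: ground terms are nested tuples, variables are strings, all symbol weights are positive, the list S of candidate terms is given, and Increase always picks the first applicable variable. The names make_key and solve are hypothetical. The loop follows a reasonable strategy in the sense of Definition 7, i.e., it prefers Backtrack and Fail over Increase.

```python
def make_key(weight, prec):
    """Sort key for ground terms under KBO, assuming every symbol weight is positive."""
    rank = {sym: i for i, sym in enumerate(prec)}

    def w(t):
        return weight[t[0]] + sum(w(a) for a in t[1:])

    def key(t):
        # Compare by weight, then head symbol, then arguments from left to right.
        return (w(t), rank[t[0]], tuple(key(a) for a in t[1:]))

    return key


def solve(inequations, inequalities, variables, S, key):
    """inequations: pairs (t, s) read as t < s; inequalities: pairs (l, r) read as l != r.
    S is the ordered list of candidate terms. Returns a solution dict or None."""
    k = len(variables)

    def ground(t, v):
        if isinstance(t, str):                      # a variable
            return S[v[variables.index(t)]]
        return (t[0],) + tuple(ground(a, v) for a in t[1:])

    def inc(v, i):
        return v[:i] + (v[i] + 1,) + v[i + 1:]

    def allowed(v, F):
        return not any(all(a >= b for a, b in zip(v, u)) for u in F)

    trace, v, F = [], (0,) * k, set()
    while True:
        too_large = any(key(ground(t, v)) >= key(s) for t, s in inequations)
        violated = [(l, r) for l, r in inequalities if ground(l, v) == r]
        if not too_large and not violated:
            return {x: S[i] for x, i in zip(variables, v)}
        # For each violated inequality, the variables whose increase repairs it.
        repairs = [[i for i in range(k)
                    if v[i] + 1 < len(S)
                    and ground(l, inc(v, i)) != r
                    and allowed(inc(v, i), F)]
                   for l, r in violated]
        if too_large or any(not options for options in repairs):
            if not trace:                           # Fail
                return None
            F.add(v)                                # Backtrack
            i = trace.pop()
            v = v[:i] + (v[i] - 1,) + v[i + 1:]
        else:                                       # Increase
            i = repairs[0][0]
            trace.append(i)
            v = inc(v, i)


a, b, c = ('a',), ('b',), ('c',)

# Example 8: no solution.
key8 = make_key({'a': 1, 'b': 2, 'c': 2, 'f': 3}, ['a', 'b', 'c', 'f'])
ex8 = solve([(('f', 'x1', 'x2'), ('f', a, c))], [('x1', a)],
            ['x1', 'x2'], [a, b], key8)

# Example 9: the run ends in the grounding (a, a, b).
key9 = make_key({'a': 1, 'b': 2, 'f': 2, 'g': 2}, ['a', 'b', 'g', 'f'])
ex9 = solve([('x1', b), (('g', 'x2', a), ('g', b, b))],
            [(('f', 'x1', 'x2', 'x3'), ('f', a, a, a)), (('g', 'x1', 'x2'), ('g', a, b))],
            ['x1', 'x2', 'x3'], [a, b, ('g', a, a)], key9)
print(ex8, ex9)
```

With the first-applicable-variable choice, the sketch visits the same groundings as the runs shown in Examples 8 and 9; other selection heuristics may produce different but equally valid runs.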


Next we prove the correctness of the algorithm. 


Lemma 10. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^l (T; v; F; C), then there is no u ∈ F
with v ≥_P u.

Proof. We prove this by induction on l. For l = 0, this holds since F = ∅. For
l > 0, the last applied rule must have been either Increase or Backtrack. If the
last applied rule was Increase, then there cannot be such a u because Increase
does not modify F and because this is part of the condition of the Increase
rule. Now assume the last applied rule was Backtrack, so the previous state was
(T x_i; v'; F'; C) with v = dec(v', i) and F = F' ∪ {v'}. If there was some u in F
such that v ≥_P u, then, since v ≤_P v', we have v' ≥_P u. Hence, by the induction
hypothesis, u ∉ F', so as F = F' ∪ {v'}, it must hold that u = v', contradiction
to v ≥_P u.


Lemma 11. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^* (T; u; F; C) ⇒_KCS^l (T'; u'; F'; C) for
l > 0, then u ≠ u' or F ≠ F'.


Proof. If all l rule applications are applications of the Increase rule, then clearly
u <_P u', so in particular, u ≠ u'. There is no rule that removes elements from
F, so F ⊆ F'. If there is at least one application of the Backtrack rule among
the l rule applications, the grounding v that is current at that point is added to
the set of forbidden groundings, and by Lemma 10, v was not contained in it
before, so F is modified and F ≠ F'.


Proposition 12. ⇒_KCS is well-founded, i.e., the algorithm always terminates.


Proof. By Lemma 11, we can reach every combination of v and F at most once.
For v, there are (m + 1)^k possibilities. We only add occurring groundings to F,
so the number of possibilities for F is upper bounded by the number of subsets
of all possible groundings, which is 2^((m+1)^k). Thus, the number of reached states
is finite (it is at most (m + 1)^k · 2^((m+1)^k)), so the algorithm terminates.


Of course, the upper bounds in the proof of Proposition 12 are far too high 
and the algorithm will run much faster in practice. 


Lemma 13. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^l (T; v; F; C) and u ∈ F, then for all
u' ≥_P u it holds that u' cannot be a solution.


Proof. The proof is by induction on l. For l = 0, we have F = ∅, so this holds. For
l > 0, if the last applied rule was Increase, the statement follows by the induction
hypothesis since F is not modified. Now assume that the last applied rule was
Backtrack. Let (T'; v'; F'; C) be the previous state. We only have to show that all
u' ≥_P v' cannot be solutions; for all other elements of F = F' ∪ {v'}, this follows
by the induction hypothesis. First assume that Backtrack is applicable because
of condition (1). Then v' cannot be a solution since l_jσ(v') = r_j. For u' ≥_P v', if
l_jσ(u') = r_j, then u' clearly cannot be a solution. Otherwise, there is a variable
x_i such that u' ≥_P inc(v', i) and l_jσ(inc(v', i)) ≠ r_j. However, it is part of
condition (1) that then, there is an element u'' ∈ F' with u'' ≤_P inc(v', i) ≤_P u',
so by the induction hypothesis, u' cannot be a solution. If Backtrack is applicable
because of condition (2), then t_jσ(v') ≥ s_j for some j ∈ {1, ..., n}. Clearly, if
u' ≥_P v', then also t_jσ(u') ≥ s_j, so u' cannot be a solution.


Corollary 14. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^* (T; v; F; C) and condition (1) or
condition (2) of Fail is fulfilled for (T; v; F), then for all u ≥_P v, u cannot be a
solution.


Proof. The conditions for Fail are the same as the conditions for Backtrack, so 
this follows by the proof of Lemma 13. 


Lemma 15. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^l (T; v; F; C), then for all i ∈ {1, ..., k}
the number of occurrences of x_i on the trace T equals v_i.


Proof. In the following, we denote the number of occurrences of x_i in T by
C(T, x_i). The proof is by induction on l. If l = 0, the statement trivially holds.
For l > 0 let (T'; v'; F'; C) be the previous state. If the last applied rule was
Increase, then T = T' x_i and v = inc(v', i), so

\[ C(T, x_i) = C(T', x_i) + 1 \overset{\mathrm{IH}}{=} v'_i + 1 = v_i. \]

For j ≠ i, C(T, x_j) = C(T', x_j) and v_j = v'_j, so the statement follows by the
induction hypothesis. If the last applied rule was Backtrack, then T' = T x_i and
v = dec(v', i), so

\[ C(T, x_i) = C(T', x_i) - 1 \overset{\mathrm{IH}}{=} v'_i - 1 = v_i. \]

Again, for j ≠ i, C(T, x_j) = C(T', x_j) and v_j = v'_j, so the statement follows by
the induction hypothesis.


Lemma 16. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^* (T; v; F; C) ⇒_KCS^Fail ⊥, then there exists
no solution.


Proof. Since Fail is applicable on (T; v; F; C), T = ε, so by Lemma 15, v =
(0, ..., 0). Hence, by Corollary 14, for all u ≥_P (0, ..., 0), u cannot be a solution,
so there exists no solution.


Lemma 17. If (ε; (0, ..., 0); ∅; C) ⇒_KCS^* (T; v; F; C) and no rule is applicable
on (T; v; F; C) then v is a solution.


Proof. Assume that for some j ∈ {1, ..., n}, we had t_jσ(v) ≥ s_j. Then, either
Backtrack or Fail would be applicable. Now assume that for some j ∈ {1, ..., m},
we had l_jσ(v) = r_j. Then, Increase would be applicable, or otherwise Backtrack
(on a non-empty trace) or Fail (on an empty trace).


Theorem 18. The algorithm is correct: If there exists a solution, then starting
from (ε; (0, ..., 0); ∅; C), the algorithm terminates in a state (T; v; F; C) where
v is a solution. If there is no solution, the algorithm terminates in ⊥.


Proof. Follows by Proposition 12, Lemma 16 and Lemma 17. 


We have implemented the above algorithm in the context of the SPASS 
reasoning workbench. The efficiency of the algorithm depends on the respective 
variables we choose for Increase. If there exists a solution, then there exists an 
execution using only the rule Increase. The following criteria might be useful to 
select the best variable for Increase:

– We prefer variables that do not occur in “critical” inequations, or in a
minimal number of inequations. A “critical” inequation is one where the weight
difference is 0 or close to 0.

– We prefer variables x_i for which the next term is not restricted by any
inequality l_j ≠ r_j.

– We prefer variables x_i for which the next term does not have a larger weight,
or for which the increase in weight is minimal.

– We prefer variables that fix multiple inequalities l_j ≠ r_j instead of just one.


It is possible to calculate and maintain some score for every variable here and 
decide based on this score. The exact selection criteria still need to be further 
explored. 

A remaining problem from the presentation of the algorithm is how to com- 
pute the k smallest terms. If the occurring weights are rather small, the following 
dynamic programming algorithm might be useful in practice. The idea is to com- 
pute all terms of a specific weight for increasing weights until we generated at 
least k terms. Unfortunately, there may be exponentially many terms of a spe- 
cific weight where the exponent is the maximal arity of a function and the base 
is the number of terms of smaller weights. However, k is bounded above by the 
number of inequalities m, the number of terms with smaller weights is bounded 
above by k and the maximal arity is probably small, so it is to be expected that 
this is not a big problem. 

As it is probably hard to find the next possible weight, we simply always
increase the weight by 1, starting with the weight of the smallest constant. Our
DP array is two-dimensional, one dimension being the weight and the other
dimension being the size of the tuple from 1 to max_arity. Actually, it is
four-dimensional since every entry is a list of tuples of terms and every tuple is a list
of its entries. A tuple of size 1 is just a term of the specific weight. The tuples
of larger size are needed for the DP transitions where they serve as argument
tuples for the functions. We maintain an array smallest_terms that will in the
end contain at least the k smallest terms.

We iterate over the weights starting at the weight of the minimal constant.
Let curweight denote the current weight. The idea is to compute all terms of
weight curweight, sort them, add them to smallest_terms, and proceed with
weight curweight + 1 if |smallest_terms| is still smaller than k. To do so, if
curweight is not the smallest weight, we first compute the tuples of size 2 to
max_arity for the previous weight. This is done via DP: For tuple size i we iterate
over the terms s ∈ smallest_terms. Then we iterate over the tuples t of size i − 1
and weight curweight − 1 − w(s) using the DP array and add (s, t) to the current
DP entry. Afterwards, we calculate all terms of weight curweight by iterating
over all symbols f and all tuples t of size arity(f) and weight curweight − w(f)
using the DP array. Then, the term f(t) has weight curweight.
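
The enumeration described above can be sketched in Python as follows. This is an illustrative reading of the description rather than the implemented code: it assumes positive symbol weights and a signature with at least k ground terms, represents terms as nested tuples, and fills the argument tuples for a weight right after the terms of that weight instead of at the start of the next round. The function name k_smallest_terms and the dictionary dp are hypothetical.

```python
def k_smallest_terms(k, weight, arity, prec):
    """Return the k smallest ground terms as nested tuples, e.g. ('g', ('a',), ('a',)).
    weight, arity: dicts from symbol to int; prec: symbols in ascending precedence."""
    rank = {sym: i for i, sym in enumerate(prec)}

    def w(t):
        return weight[t[0]] + sum(w(a) for a in t[1:])

    def key(t):
        return (w(t), rank[t[0]], tuple(key(a) for a in t[1:]))

    max_arity = max(arity.values())
    dp = {}                      # dp[(wt, n)]: all n-tuples of terms with total weight wt
    smallest_terms = []
    curweight = min(weight[c] for c in prec if arity[c] == 0)
    while len(smallest_terms) < k:
        # All terms of weight curweight: constants, or a head symbol plus an
        # argument tuple taken from the DP table.
        new_terms = [(f,) for f in prec if arity[f] == 0 and weight[f] == curweight]
        for f in prec:
            if arity[f] > 0:
                new_terms += [(f,) + args
                              for args in dp.get((curweight - weight[f], arity[f]), [])]
        new_terms.sort(key=key)
        smallest_terms += new_terms
        # Argument tuples of total weight curweight, built from shorter tuples.
        dp[(curweight, 1)] = [(t,) for t in new_terms]
        for n in range(2, max_arity + 1):
            dp[(curweight, n)] = [(s,) + rest
                                  for s in smallest_terms
                                  for rest in dp.get((curweight - w(s), n - 1), [])]
        curweight += 1
    return smallest_terms[:k]


# Signature with constants a, b, binary g and ternary f; weights 1, 2, 2, 2; a < b < g < f.
terms = k_smallest_terms(3, {'a': 1, 'b': 2, 'g': 2, 'f': 2},
                         {'a': 0, 'b': 0, 'g': 2, 'f': 3}, ['a', 'b', 'g', 'f'])
print(terms)
```

Because every weight is visited in increasing order, the list only needs to be sorted within one weight class at a time.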

We finish this section by a discussion of potential heuristics, i.e., sufficient
conditions for a simple right-ground KBO constraint to have a solution. As explained
before, every inequality l_j ≠ r_j rules out any assignment that agrees with τ_j, the
matcher from l_j to r_j. Now assume we have m inequalities and know that there
are more than m solutions for the inequation t < s, then one might think that
there is a grounding that solves all inequalities l_j ≠ r_j and the inequation t < s.
However, this is not true.


Example 19. Consider a signature with constants a, b and c and a binary function
f. The weights are w(a) = 1; w(b) = w(c) = 2; w(f) = 3 and we use a ≺ b ≺ c ≺
f as a precedence. Now consider the constraint


C = {x ≠ a, f(x, y) < f(a, c)}.


The inequation has two solutions, namely {x ↦ a, y ↦ a} and {x ↦ a, y ↦ b}.
However, it has no solution where x is not mapped to a, so for the overall
problem, there is no solution.
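
The counts in Example 19 can be checked by brute force. The sketch below assumes the nested-tuple term representation and a KBO sort key for ground terms with positive weights; both are illustrative choices. Enumerating constants suffices because the smallest non-constant term f(a, a) already has weight 5, so any non-constant argument pushes f(x, y) beyond the weight 6 of f(a, c).

```python
from itertools import product

weight = {'a': 1, 'b': 2, 'c': 2, 'f': 3}
rank = {'a': 0, 'b': 1, 'c': 2, 'f': 3}


def term_weight(t):
    return weight[t[0]] + sum(term_weight(x) for x in t[1:])


def key(t):
    """KBO sort key for ground terms: weight, then head symbol, then arguments."""
    return (term_weight(t), rank[t[0]], tuple(key(x) for x in t[1:]))


constants = [('a',), ('b',), ('c',)]
bound = ('f', ('a',), ('c',))

# Solutions of the inequation f(x, y) < f(a, c) alone.
inequation_solutions = [(x[0], y[0]) for x, y in product(constants, repeat=2)
                        if key(('f', x, y)) < key(bound)]
# Solutions that additionally satisfy x != a.
overall_solutions = [(x, y) for x, y in inequation_solutions if x != 'a']

print(inequation_solutions)
print(overall_solutions)
```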


So the above sufficient condition needs to be refined in order to be correct.
However, calculating the number of solutions is again NP-hard.


Proposition 20. Calculating the number of solutions σ for some right-ground
inequation t < s is NP-hard.


Proof. We reduce from the Unbounded Subset Sum Problem (USSP) which is
NP-complete by [7]. Let s_1, ..., s_n, T ∈ N^+. We have to find out whether there
are x_1, ..., x_n ∈ N such that Σ_i x_i s_i = T, i.e., whether there is a multiset of
values from {s_1, ..., s_n} that sums up to T. Assume we had an oracle that could
compute the number of solutions for any inequation l < r where r is ground. We
will use this oracle twice.

For both uses, we use a signature with constants c and d and unary functions
f_1, ..., f_n. We have w(c) = 1, w(f_i) = s_i for i ∈ {1, ..., n} and d ≺ c ≺ f_1 ≺
··· ≺ f_n. For the first case, set w(d) = T + 2. Using the oracle with the inequation
x < d, we get the number of terms smaller than d. Since d is the smallest term
of weight T + 2, this is exactly the number of terms with weight ≤ T + 1. For
the second case, set w(d) = T + 1. Again, using the oracle with the inequation
x < d, we get the number of terms smaller than d. This time, this is the number
of terms with weight ≤ T. If we now subtract those values, we get the number
of terms with weight exactly T + 1.
Now the USSP has a solution iff the number of terms with weight exactly 
T +1 is not 0. Every term t of weight T +1 must have the constant c as subterm 
since the weight of d is too large. The rest of t must consist of the unary functions. 
Hence, the weights of the unary functions used sum up to T + 1 — 1 = T. Since 
the weights of the unary functions correspond to the numbers from the USSP, 
this yields a solution for the USSP. Conversely, given a solution to the USSP, we 
can construct a term of weight T + 1 analogously. 
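
The correspondence between terms of weight exactly T + 1 and USSP solutions can be illustrated directly. The sketch below does not implement the oracle from the proof; it counts the terms f_i1(f_i2(...(c))) of a given weight with a simple recurrence, under the assumption that the input values are positive integers. The function names are hypothetical. The recurrence runs in time proportional to n · T, i.e., it is pseudo-polynomial, which is consistent with the NP-hardness result for binary-encoded inputs.

```python
def count_terms_of_weight(target, unary_weights, constant_weight=1):
    """Number of ground terms f_i1(f_i2(...(c))) whose total weight equals target."""
    need = target - constant_weight      # weight that the unary symbols must contribute
    if need < 0:
        return 0
    ways = [0] * (need + 1)
    ways[0] = 1                          # the bare constant c
    for total in range(1, need + 1):
        # The outermost symbol is some f_i; the rest is a term of smaller weight.
        ways[total] = sum(ways[total - s] for s in unary_weights if s <= total)
    return ways[need]


def ussp_has_solution(values, T):
    """USSP instance (values, T) is solvable iff some term has weight exactly T + 1."""
    return count_terms_of_weight(T + 1, values) > 0


print(count_terms_of_weight(9, [3, 5]))   # terms f1(f2(c)) and f2(f1(c)) for T = 8
print(ussp_has_solution([3, 5], 7))
```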


The problem with the aforementioned insufficient condition is that an
inequality l_j ≠ r_j does not necessarily rule out only one grounding, but
possibly infinitely many groundings. This happens if there are variables that are
not restricted by the matcher τ_j of l_j and r_j. However, the criterion can be
refined to a correct sufficient condition. If we restrict ourselves to the m + 1
smallest terms again, we would again at least have a finite number of groundings
that l_j ≠ r_j rules out. If we now sum up these numbers over all inequalities,
we have an upper bound on the total number of ruled out groundings. For the 
inequation t < s, the same problem with variables that do not occur arises (there 
may be infinitely many solutions), so here, we restrict ourselves to the m + 1 
smallest terms again. If now, the number of solutions for t < s is larger than the 
upper bound on the total number of ruled out groundings, we can actually be 
sure that there is a solution. However, this correct sufficient condition is hard to 
compute and therefore seems to be not very useful in practice. 


4 Further Constraint Variants and Ordering Relaxation 


In this section we study further variants of constraint problems and eventually
extend the algorithm of Sect. 3 to alternating KBO constraints.


Proposition 21. Checking satisfiability for right-ground KBO constraints 
restricted to strict inequations is NP-hard. 


Proof. The proof strategy is the same as the one used in the proof of Proposition
5. The encoding for positive clauses stays the same as < is still allowed. For
negative clauses ¬P ∨ ¬Q ∨ ¬R we encode them as f(x_P, x_Q, x_R) > f(a, a, a).
This inequation can only be satisfied by a grounding that does not map all of
these variables to a, and is trivially satisfied by any such grounding.


In particular, we have seen that only having constraints of the form t_j < s_j
and t_j > s_j suffices to make the problem NP-hard. Next we turn to a weaker term
ordering ≤_sym solely based on symbol counting. Even for this ordering constraint
solving remains NP-hard.


Definition 22. For ground terms t, s ∈ T(Σ), we define t ≤_sym s :⟺
|sym(t)| ≤ |sym(s)|, i.e., t does not contain more symbols than s.


Definition 23. A right-ground symbol constraint C is a finite set of atoms t # s
with t ∈ T(Σ, X), s ∈ T(Σ) and # ∈ {≤_sym, ≠}. Satisfiability is defined
analogously to the satisfiability of KBO constraints.


Proposition 24. Checking satisfiability for right-ground symbol constraints is 
NP-hard. 


Proof. The proof strategy is the same as the one used in the proof of Proposition
5. We encode positive clauses P ∨ Q ∨ R as f(x_P, x_Q, x_R) ≤_sym f(g(a), g(a), a). The
only way to satisfy this inequation is to map at least one of these variables to a.
Negative clauses ¬P ∨ ¬Q ∨ ¬R are encoded as f(x_P, x_Q, x_R) ≠ f(a, a, a).
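
The encoding of positive clauses can be checked exhaustively for the two candidate terms a and g(a). The sketch below assumes ground terms as nested tuples; the helper names sym and positive_clause_holds are chosen only for this illustration.

```python
from itertools import product


def sym(t):
    """Number of symbols in a ground term given as a nested tuple."""
    return 1 + sum(sym(x) for x in t[1:])


a = ('a',)
ga = ('g', a)
bound = ('f', ga, ga, a)          # f(g(a), g(a), a) has 6 symbols


def positive_clause_holds(xs):
    """f(x_P, x_Q, x_R) <=_sym f(g(a), g(a), a) for the grounding xs."""
    return sym(('f',) + tuple(xs)) <= sym(bound)


# One entry per grounding of (x_P, x_Q, x_R) over {a, g(a)}.
results = {xs: positive_clause_holds(xs) for xs in product([a, ga], repeat=3)}
for xs, ok in results.items():
    print(xs, ok)
```

Only the grounding that maps all three variables to g(a) exceeds the symbol bound, which matches the propositional reading that at least one of P, Q, R must be true.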


In particular, the NP-hardness of these problems is not caused by the
complicated structure of the KBO since the problem is already NP-hard for a
comparison as simple as counting the number of symbols.

Our next variants are motivated by the definition of congruence classes with
respect to terms with variables. For the first variant, all instances of the defining
term t have to be smaller than a single ground term β and different from ground
terms s_1, ..., s_n.


Definition 25. A simple, single ground KBO constraint C consists of terms
t ∈ T(Σ, X) and s_1, ..., s_n, β ∈ T(Σ). We say that C is satisfiable if there exists
a substitution σ that is grounding for t such that

\[ \bigwedge_{j=1}^{n} t\sigma \neq s_j \;\wedge\; t\sigma < \beta. \]


Proposition 26. Assuming that we are given the n+1 smallest terms, checking 
satisfiability of simple, single right-ground KBO constraints is in P. 


Proof. Actually, for this problem, if a reasonable strategy (Definition 7) is used,
the algorithm from Sect. 3 runs in polynomial time. The key difference to the
other problems is that here, every variable occurs in every inequality, so every
inequality rules out at most one grounding. We first show that we can only reach
polynomially many states. First, consider states (T; v; F; C) where tσ(v) < β.
If v does not violate any inequality t ≠ s_j, then the algorithm terminates, so
there is at most one such state. For every inequality t ≠ s_j, there is at most one
grounding v that violates it. We claim that we reach at most k + 1 states with
current grounding v where k is the number of variables. The grounding v is
reached at most once using Increase because otherwise, there must be an
intermediate Backtrack, so v would be inserted into F. If we reach v using Backtrack
for some variable x_i, then inc(v, i) was inserted into F, so for every variable x_i,
we can reach v using Backtrack at most once. Hence, v is reached at most k + 1 times.

Now, consider states (T; v; F; C) where tσ(v) ≥ β. Since a reasonable strategy
is used, we must have reached this state from a state that does not violate t < β,
so by the argumentation before, at most k such states can be reached for every
inequality, so at most n · k in total where n is the number of inequalities.

Hence, in total, there are at most (k + 1) · n + n · k + 1 states. The state
transitions can be done in polynomial time because we only need to iterate over 
all inequalities and inequations and over all entries in F. Since there are only 
polynomially many states and every rule application inserts at most one element 
into F, F has polynomially many entries. 


Definition 27 (Alternating KBO Constraint). An alternating KBO constraint
C consists of terms t, s_1, ..., s_n ∈ T(Σ, X) and β ∈ T(Σ). We say that
C is satisfiable if there exists a substitution σ that is grounding for t such that
for all substitutions τ that are grounding for all s_j we have

\[ \bigwedge_{j=1}^{n} t\sigma \neq s_j\tau \;\wedge\; t\sigma < \beta. \]


Proposition 28. Checking satisfiability for alternating KBO constraints is
NP-hard.


Proof. We reduce from SAT. Let N be a set of clauses and X_1, ..., X_k be the
variables occurring in N. We use a signature with a k-ary function f, two
constants a and b, variables x_1, ..., x_k and y_1, ..., y_k. Set t = f(x_1, ..., x_k). Now for
every clause C_j ∈ N we introduce an inequality f(x_1, ..., x_k) ≠ f(s_{j,1}, ..., s_{j,k})
where we set s_{j,i} = b if X_i occurs positively in C_j, s_{j,i} = a if X_i occurs
negatively in C_j and s_{j,i} = y_i if X_i does not occur in C_j. The idea is that x_i = a
stands for X_i is set to ⊤ and x_i = b stands for X_i is set to ⊥. ∀τ. x_iσ ≠ y_iτ is
obviously impossible to satisfy, so the inequality must be made true by setting
some positive variable to a or some negative variable to b.

To ensure that the x_i are only mapped to a or b, we do the following: We first
introduce a new constant c and set β = c. Then we set w(f) = w(a) = w(b) = 1,
w(c) = k + 2 and c ≺ a ≺ b ≺ f. If all variables x_i are mapped to a or b, we
have w(tσ) = k + 1, i.e., tσ < β. Any grounding where some x_i is not mapped
to a or b results in tσ > β.

Now there is a solution σ iff there is a satisfying valuation for N.
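
The reduction can be illustrated on small inputs. In the sketch below, clauses are lists of non-zero integers in the style of DIMACS files (an assumption of this illustration, as are the function names), no clause contains a variable both positively and negatively, and the restriction of the x_i to {a, b}, which the proof enforces through tσ < β, is hard-coded in the enumeration.

```python
from itertools import product


def clause_pattern(clause, k):
    """Right-hand side arguments for one clause: 'b' for a positive literal,
    'a' for a negative one, None (a fresh variable y_i) if X_i does not occur."""
    pattern = [None] * k
    for lit in clause:
        pattern[abs(lit) - 1] = 'b' if lit > 0 else 'a'
    return pattern


def solutions(clauses, k):
    """Groundings of x_1..x_k over {a, b} that differ from every instance of every pattern."""
    patterns = [clause_pattern(c, k) for c in clauses]

    def matches(xs, pattern):
        # A None entry is a universally quantified y_i and matches anything.
        return all(p is None or p == x for x, p in zip(xs, pattern))

    return [xs for xs in product('ab', repeat=k)
            if not any(matches(xs, p) for p in patterns)]


# (X1 or X2) and (not X1 or X2) and (X1 or not X2): only X1 = X2 = true, i.e. (a, a).
print(solutions([[1, 2], [-1, 2], [1, -2]], 2))
# Adding (not X1 or not X2) makes the clause set unsatisfiable.
print(solutions([[1, 2], [-1, 2], [1, -2], [-1, -2]], 2))
```

A grounding matches the pattern of a clause exactly when it falsifies that clause, so the surviving groundings are the satisfying valuations.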


If a reasonable strategy is used, satisfiability of alternating KBO constraints
can be checked using the algorithm from Sect. 3. Any solution σ must be such
that tσ < β, so we only have to consider instances of the s_jτ with s_jτ < β.
What we can now do is to calculate for all s_j all groundings τ with s_jτ < β
and add the inequality t ≠ s_jτ to the constraint. There are only finitely many
such groundings because we did not allow unary functions f with w(f) = 0.
This way, we obtain a simple right-ground KBO constraint, so we can apply
the algorithm. A more efficient possibility to do this is to add the groundings of
the s_j implicitly, i.e., to change the condition of Increase (and the first case of
Backtrack and Fail) to whether there exists a matcher τ such that l_jσ(v) = r_jτ.
Also, the condition for the next grounding for Increase changes: It is not that 
we fix the inequality anymore, but that we change a variable that occurs on the 
left side of the inequality. 
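
The implicit variant only needs a one-sided matching test between the right-hand side pattern and the current ground instance of the left-hand side. A minimal sketch of such a test is given below, assuming variables are strings and ground terms are nested tuples; the function name match is chosen for this illustration.

```python
def match(pattern, term, subst=None):
    """One-sided matching: return tau with pattern*tau == term, or None if none exists.
    Variables are strings, ground terms are tuples such as ('f', ('a',), ('a',))."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                       # a variable
        if pattern in subst and subst[pattern] != term:
            return None                                # conflicting binding
        subst[pattern] = term
        return subst
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None                                    # different head symbol or arity
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


a = ('a',)
s1 = ('f', ('g', 'y1'), 'y2')
s2 = ('f', a, a)

print(match(s1, ('f', ('g', a), a)))    # a matcher exists
print(match(s1, ('f', a, a)))           # no matcher
print(match(s2, ('f', a, a)))           # ground pattern: the empty matcher
```

Note that the empty matcher is an empty dictionary, so callers should test the result against None rather than for truthiness.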


Example 29. Consider the signature Σ = {f^(2), g^(1), a^(0)}, where the superscript
numbers denote the function arities, together with the following alternating KBO
constraint C:


t = f(x_1, x_2)          s_1 = f(g(y_1), y_2)
β = f(f(a, a), a)        s_2 = f(a, a)


We set w(a) = w(g) = w(f) = 1 and a ≺ g ≺ f. The few smallest terms are


a, g(a), g(g(a)), f(a, a).


Note that for alternating KBO constraints, it does not suffice anymore to
consider the n + 1 smallest terms only since an inequality may rule out more than
one term for a variable. However, as mentioned in Sect. 3, we calculate the
smallest terms as needed, so this is not a problem. For shorter notation, for F, we
omit groundings u if there is a grounding v ∈ F with v ≤_P u. A possible run of
the algorithm looks as follows:


(ε; (a, a); ∅; C)
⇒_KCS^Increase (x_1; (g(a), a); ∅; C)                         s_2, τ = {}
⇒_KCS^Increase (x_1 x_1; (g(g(a)), a); ∅; C)                  s_1, τ = {y_1 ↦ a, y_2 ↦ a}
⇒_KCS^Increase (x_1 x_1 x_1; (f(a, a), a); ∅; C)              s_1, τ = {y_1 ↦ g(a), y_2 ↦ a}
⇒_KCS^Backtrack (x_1 x_1; (g(g(a)), a); {(f(a, a), a)}; C)    β
⇒_KCS^Backtrack (x_1; (g(a), a); {(g(g(a)), a)}; C)           s_1
⇒_KCS^Backtrack (ε; (a, a); {(g(a), a)}; C)                   s_1
⇒_KCS^Increase (x_2; (a, g(a)); {(g(a), a)}; C)               s_2, τ = {}


5 Experiments 


We implemented the algorithm of Sect. 3 and its extension to constraints with
right hand side variables, Definition 27, and tested it in the context of an 
extended congruence closure (CC) algorithm with variables [6,8,15,16]. We 
implemented a rather naive variant of [8] with the only goal to generate KBO 
constraints in order to test our new algorithm on KBO constraints. In contrast 
to [8] our algorithm considers a finite signature, as usual for first-order logic 
problems. All experiments were carried out on a Debian Linux server equipped 
with AMD EPYC 7702 64-Core CPUs running at 3.35GHz and an overall mem- 
ory of 2TB. The result of all runs as well as all input files and binaries can be 
found at https://nextcloud.mpi-klsb.mpg.de/index.php/s/BAwd99cxFpSJmSp.

As a first test case we considered all eligible UEQ problems from CASC- 
J11 [17]. We consider equations and all inequalities for the congruence closure 
algorithm. The equations generate the congruence and for the inequalities we 
compute the congruence classes for the respective right and left side term of the 
inequality. For each example, the KBO function weight was always set to one and 
the precedence is generated with respect to the occurrence of symbols in the input 
file in ascending order. For β we chose a fixed nesting depth of 4 and build for each
input file a nested term of exactly this depth using function symbols in the order
of occurrence in the input, starting with a non-constant function symbol. Out
of all eligible problems our CC algorithm terminated on 186 problems within
a time limit of 30 min. Please note that although our CC implementation is
rather naive, in contrast to the classical ground CC algorithm it does not need a
complete grounding; for the examples where our naive algorithm runs out of time
a complete grounding is not affordable. The table below shows some typical runs
on the UEQ domain. All timings are presented in hundredths of a second and
if they take less than one hundredth of a second we write zero. The table
shows the problem name, the number of ground terms smaller than β indicating
the solution space for the constraint, the summed up time of all calls to the KBO
constraint solver during the CC run, the number of calls to the KBO constraint
solver, and the results of these calls. The three selected examples are typical:
most of the problems are satisfiable and the constraint solving algorithm needs
almost no time. Note that for the first example all 8014 calls to the constraint
solver needed in sum 3 hundredths of a second. LAT143-1 is the example
showing the worst constraint solving performance, i.e., still less than a hundredth
of a second per call.


Problem  | < β    | Time KBO Constraint Solver | #calls | #true | #false
GRP183-3 | 9969   | 3                          | 8014   | 7946  | 68
LAT143-1 | 29720  | 8797                       | 31033  | 29554 | 1479
GRP409-1 | 103565 | 0                          | 6      | 6     | 0


For the SMT-LIB examples of the UF domain [1], we expanded let operators, 
removed the typing, coded predicates as equations, did a CNF transformation 
and then took the first literal of each clause as input for the CC algorithm. 
Nesting depth was set to 2, the rest done as for the UEQ examples. Removing 
types means that the number of smaller terms increases, i.e., the problems get 
potentially more difficult for the constraint solver, in particular for unsatisfiable 
constraints. The table below again shows some typical results. The CC algorithm
terminated on 1112 examples within 30 min. The UF domain contains
larger examples compared to the UEQ domain, but the characteristics remain. 
Constraint solving itself takes almost no time. Again all timings are presented 
in hundredths of a second. 


Problem | < β | Time KBO Constraint Solver | #calls | #true | #false
00336 2120806
uf.926761 | 138397692 | 0
uf.555113 | 254939


Here uf.555113 is the worst example on constraint solving time with 1.34s for 
5120 calls. Although alternating KBO constraint solving is NP-hard, in practice 
there are typically only a few inequalities, meaning that out of the overall number
of terms smaller than β, only a few need to be considered.


6 Discussion 


We have studied a number of specific KBO constraint solving problems motivated
by the SCL calculus and established their complexity. Except for simple,
single right-ground KBO constraints all studied problems are proven NP-hard.
We proposed an algorithm for simple right-ground KBO constraints and extended
it to alternating KBO constraints, which include a quantifier alternation. On the
benchmark problems, constraint solving itself took almost no time. Our next step
is to turn our naive CC implementation with variables into a robust algorithm.


Acknowledgments. We thank our reviewers for their constructive comments that 
helped us improve the paper. 

Abstract. The rewrite relation of a conditional term rewriting system 
(CTRS) can be divided into a hierarchy of rewrite relations of term 
rewriting systems (TRSs) by the depth of the recursive use of the rewrite 
relation in conditions; a CTRS is said to be level-confluent if each of 
these TRSs is confluent, and level-confluence implies confluence. We 
introduce level-commutation of CTRSs that extends the notion of level- 
confluence, in a way similar to extending confluence to commutation, and 
give a critical pair criterion for level-commutation of oriented CTRSs 
with extra variables (3-CTRSs). Our result generalizes a criterion for 
commutation of TRSs of (Toyama, 1987), and properly extends a crite- 
rion for level-confluence of orthogonal oriented 3-CTRSs (Suzuki et al., 
1995). We also present criteria for level-confluence and commutation of 
join and semi-equational 3-CTRSs that may have overlaps. 


Keywords: Level-commutation · Level-confluence · Commutation · 
Confluence · Critical pair · Conditional term rewriting systems 


1 Introduction 


Confluence, which guarantees unique results of computations, is an important 
property of term rewriting systems (TRSs). Commutativity between two TRSs 
is a natural generalization of confluence in the sense that self-commutativity 
coincides with confluence. It also allows one to infer confluence of TRSs in a modular 
way: the union of two confluent TRSs is confluent if they commute. 
Conditional term rewriting systems (CTRSs) are extensions of TRSs in which 
each rewrite rule can be equipped with conditions, where these conditions are 
supposed to be evaluated recursively using the underlying CTRS itself. Some 
types of CTRSs are known as models of functional (and logic) programs. The 
underlying logic of TRSs is equational logic, whereas that of CTRSs is 
quasi-equational logic, which also constitutes an important class of systems 
for reasoning on a wider class of algebras. 

From the computational point of view, the rewrite relation of a CTRS can be 
divided into a hierarchy of rewrite relations of TRSs by the depth of the recursive 
use of the rewrite relation in conditions; a CTRS is said to be level-confluent if each 
of these TRSs is confluent. Suzuki et al. showed a criterion for orthogonal (i.e. 
left-linear non-overlapping) oriented CTRSs to be level-confluent [14]. 
Level-confluence implies confluence, and their result can be thought of as a generalization 
of confluence of orthogonal TRSs. More crucially, since far fewer criteria 
have been obtained for CTRSs compared to TRSs, level-confluence can be seen 
as an important approach to obtaining confluence proofs of CTRSs. In contrast 
to TRSs, where many extensions of the orthogonality criterion for left-linear 
(possibly overlapping) TRSs to have confluence have been explored (e.g., [4,8, 
11,16]), similar extensions for CTRSs are not known. Similarly, several criteria 
for ensuring commutation for left-linear TRSs are known (e.g., [16,19]). Again, 
similar criteria for left-linear CTRSs are not known. In this paper, we give a 
criterion for a class of (possibly overlapping) left-linear oriented CTRSs, under 
which we prove level-commutation of such CTRSs. Our result is a generalization 
of the one given for TRSs in [16] and properly extends the result of [14] mentioned 
above. We also present criteria for level-confluence and commutation of left-linear 
join and semi-equational CTRSs that may have overlaps. 

The rest of the paper is organized as follows. In the next section, we fix 
some notions and notations used in this paper, and explain two results that give 
starting points of our work. In Sect. 3, we present our main theorem on 
level-commutation of oriented CTRSs and its proof in detail, and explain relations 
to the previous results. We then give some results on join CTRSs and semi- 
equational CTRSs in Sect. 4. Section 5 concludes. 


2 Preliminaries 


We basically follow standard notions and notations (e.g., [8,10]). Below, we 
explain some key notions and fix notations that will be used in this paper, while 
omitting most of definitions of standard notions and notations. 

We consider a set $\mathcal{F}$ of function symbols. The set of variables is denoted by 
$\mathcal{V}$ and the set of terms over $\mathcal{F}$ and $\mathcal{V}$ by $T(\mathcal{F}, \mathcal{V})$. We sometimes specify a set 
$\mathcal{C} \subseteq \mathcal{F}$ of constructors to give the set of constructor terms $T(\mathcal{C}, \mathcal{V})$, i.e. terms over 
$\mathcal{C}$ and $\mathcal{V}$. The set of variables in a term $t$ is denoted by $\mathcal{V}(t)$. A term $t$ is linear 
if each variable occurs in $t$ at most once; $t$ is ground if no variable occurs in $t$. 
The size of a term $t$ is denoted by $|t|$. The set of positions in a term $t$ is denoted 
by $\mathrm{Pos}(t)$; the root position is written as $\epsilon$. The symbol at a position $p \in \mathrm{Pos}(t)$ 
in a term $t$ is written as $t(p)$. We put $\mathrm{Pos}_{\mathcal{F}}(t) = \{p \in \mathrm{Pos}(t) \mid t(p) \in \mathcal{F}\}$. 

If $t = C[u]_p$ for a context $C$, we say $u$ is a subterm of $t$ (at a position 
$p \in \mathrm{Pos}(t)$). The subterm of $t$ at a position $p \in \mathrm{Pos}(t)$ is written as $t|_p$. For 
terms $t = C[u]_p$ and $s$, the term $C[s]_p$ is denoted by $t[s]_p$. We speak of subterm 
occurrences when we consider subterms with their respective positions; see e.g. 
[15] for a precise formalization of subterm occurrences. We will use capital letters 
$A, B, \ldots$ for subterm occurrences. For simplicity, a subterm occurrence $A$ in a 
term is also treated as a term $A$ (for example, we might write $\mathcal{V}(A)$). Suppose 
$A, B$ are subterm occurrences in a term $t$. If $t = C[A]_p$ and $t = C'[B]_q$ with $p \le q$ 
($p < q$), we say that $B$ is a (proper) subterm occurrence in the subterm occurrence 
$A$ and write $B \subseteq A$ ($B \subset A$, respectively). Overlaps on subterm occurrences will 
be used to give a notion of weight on which our induction proof works. 
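The notions of position, subterm and replacement can be made concrete with a small sketch. The encoding below is an assumption made for illustration only (variables as strings, applications as tuples, positions as tuples of 1-based argument indices with `()` playing the role of the root position); the function names are hypothetical.

```python
# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argk).
# A constant a is written ("a",).

def positions(t):
    """Pos(t): all positions of t, the root being the empty tuple ()."""
    if isinstance(t, str):
        return [()]
    return [()] + [(i,) + p
                   for i, arg in enumerate(t[1:], start=1)
                   for p in positions(arg)]

def subterm(t, p):
    """t|_p: the subterm of t at position p."""
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    """t[s]_p: replace the subterm of t at position p by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def variables(t):
    """V(t): the set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def is_linear(t):
    """A term is linear if no variable occurs more than once."""
    occ = [subterm(t, p) for p in positions(t) if isinstance(subterm(t, p), str)]
    return len(occ) == len(set(occ))
```

For instance, with `t = ("f", ("g", "x"), "y")`, `subterm(t, (1, 1))` is `"x"` and `replace(t, (1,), ("a",))` is `("f", ("a",), "y")`.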

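A term rewriting system (TRS, for short) $\mathcal{R}$ is a set of rewrite rules, where 
each rewrite rule $l \to r$ satisfies the conditions $l \notin \mathcal{V}$ and $\mathcal{V}(r) \subseteq \mathcal{V}(l)$. Rewrite 
rules are identified modulo renaming. A TRS $\mathcal{R}$ is left-linear if $l$ is linear for each 
$l \to r \in \mathcal{R}$. We write $s \xrightarrow{p}_{\mathcal{R}} t$ if $s|_p$ is the redex of this rewrite step; we also write 
$s \xrightarrow{A}_{\mathcal{R}} t$ to indicate the redex occurrence $A$ of this rewrite step. The relation 
$\to_{\mathcal{R}}$ over terms is called the rewrite relation of $\mathcal{R}$, and its reflexive transitive 
closure is denoted by $\xrightarrow{*}_{\mathcal{R}}$. A reduction is a successive sequence of rewrite steps 
$t_0 \to_{\mathcal{R}} t_1 \to_{\mathcal{R}} \cdots \to_{\mathcal{R}} t_n$, where $n$ is the length of this reduction. When no 
confusion arises, a reduction $s \to_{\mathcal{R}} \cdots \to_{\mathcal{R}} t$ is written as $s \xrightarrow{*}_{\mathcal{R}} t$ for brevity, 
whose length is denoted by $|s \xrightarrow{*}_{\mathcal{R}} t|$. We have a parallel rewrite step $s \mathrel{-\!\!\shortparallel\!\!\to}_{\mathcal{R}} t$ if 
$s = C[A_1, \ldots, A_n]$, $t = C[B_1, \ldots, B_n]$ ($n \ge 0$) for some context $C$ and subterm 
occurrences $A_i, B_i$ such that $A_i \to_{\mathcal{R}} B_i$ for all $i = 1, \ldots, n$; this rewrite step is 
written as $s \overset{A_1, \ldots, A_n}{\mathrel{-\!\!\shortparallel\!\!\to}}_{\mathcal{R}} t$ to indicate the redex occurrences $A_1, \ldots, A_n$. 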

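A relation $\to$ is confluent if $\xleftarrow{*} \circ \xrightarrow{*} \;\subseteq\; \xrightarrow{*} \circ \xleftarrow{*}$; a TRS $\mathcal{R}$ is confluent if so is 
its rewrite relation $\to_{\mathcal{R}}$. Relations $\to$ and $\leadsto$ commute (or, are commutative) if 
$\xleftarrow{*} \circ \overset{*}{\leadsto} \;\subseteq\; \overset{*}{\leadsto} \circ \xleftarrow{*}$; TRSs $\mathcal{R}$ and $\mathcal{S}$ commute if so do their rewrite relations $\to_{\mathcal{R}}$ and 
$\to_{\mathcal{S}}$. Clearly, self-commutativity equals confluence, and from a sufficient criterion 
for commutativity one for confluence naturally arises. 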
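On a finite carrier set, confluence and commutation can be checked directly from the definitions. The following is a minimal sketch under the assumption that each relation is given as a dictionary mapping an element to the set of its one-step successors; it is meant to illustrate the definitions, not to decide these properties for rewrite relations of TRSs (which are infinite in general).

```python
def reachable(rel, x):
    """All elements reachable from x in zero or more steps of rel."""
    seen, todo = {x}, [x]
    while todo:
        y = todo.pop()
        for z in rel.get(y, ()):
            if z not in seen:
                seen.add(z)
                todo.append(z)
    return seen

def commute(rel1, rel2, universe):
    """Check that every peak a *<-_1 x ->*_2 b has a valley a ->*_2 c *<-_1 b."""
    for x in universe:
        for a in reachable(rel1, x):
            for b in reachable(rel2, x):
                if not (reachable(rel2, a) & reachable(rel1, b)):
                    return False
    return True

def confluent(rel, universe):
    """Confluence is self-commutation."""
    return commute(rel, rel, universe)
```

For example, `{"a": {"b", "c"}}` is not confluent on `{"a", "b", "c"}`, while adding `b -> d` and `c -> d` makes it confluent.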

Let $l_1 \to r_1$ and $l_2 \to r_2$ be rewrite rules so that their sets of variables 
are renamed to be disjoint. If a non-variable subterm $l_2|_p$ of $l_2$ satisfies $l_2|_p\sigma = 
l_1\sigma$ for some substitution $\sigma$, we say that $l_1 \to r_1$ overlaps on $l_2 \to r_2$ (at $p$), 
provided that $p \neq \epsilon$ for the case $l_1 \to r_1$ and $l_2 \to r_2$ are identical. Suppose 
$l_1 \to r_1$ overlaps on $l_2 \to r_2$ at $p$ and $\sigma$ is an mgu of $l_2|_p$ and $l_1$. Then the pair 
$\langle l_2[r_1]_p\sigma, r_2\sigma \rangle$ is called a critical pair (obtained from that overlap); the pair is 
called outer if $p = \epsilon$ and is called inner if $p > \epsilon$. The set of critical pairs from 
overlaps of rules of $\mathcal{R}$ is denoted by $\mathrm{CP}(\mathcal{R})$; the sets of outer and inner critical 
pairs are denoted by $\mathrm{CP}_{out}(\mathcal{R})$ and $\mathrm{CP}_{in}(\mathcal{R})$, respectively. Let $\mathcal{R}, \mathcal{S}$ be TRSs. The set of 
critical pairs obtained from overlaps of $l_1 \to r_1 \in \mathcal{R}$ on $l_2 \to r_2 \in \mathcal{S}$ is denoted 
by $\mathrm{CP}(\mathcal{R}, \mathcal{S})$. The sets $\mathrm{CP}_{out}(\mathcal{R}, \mathcal{S})$ and $\mathrm{CP}_{in}(\mathcal{R}, \mathcal{S})$ are defined similarly. We 
are now ready to state a sufficient criterion for commutativity of TRSs. 
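The definition of critical pairs translates into a short procedure: rename the first rule apart, try to unify its left-hand side with every non-variable subterm of the second left-hand side, and instantiate. The sketch below is illustrative only; the term encoding (variables as strings, applications as tuples), the priming used for renaming, and all function names are assumptions, and the renaming is only safe if the second rule contains no primed variable names.

```python
def apply(t, s):
    """Apply the (triangular) substitution s to t, following bindings."""
    if isinstance(t, str):
        return apply(s[t], s) if t in s else t
    return (t[0],) + tuple(apply(a, s) for a in t[1:])

def occurs(x, t):
    """Does variable x occur in term t?"""
    return x == t if isinstance(t, str) else any(occurs(x, a) for a in t[1:])

def unify(a, b):
    """Return an mgu of a and b as a dict, or None if they do not unify."""
    s, stack = {}, [(a, b)]
    while stack:
        u, v = stack.pop()
        u, v = apply(u, s), apply(v, s)
        if u == v:
            continue
        if isinstance(v, str) and not isinstance(u, str):
            u, v = v, u                      # make sure a variable is on the left
        if isinstance(u, str):
            if occurs(u, v):
                return None                  # occurs check
            s[u] = v
        elif u[0] == v[0] and len(u) == len(v):
            stack.extend(zip(u[1:], v[1:]))
        else:
            return None
    return s

def fun_positions(t):
    """Positions of non-variable subterms of t."""
    if isinstance(t, str):
        return []
    return [()] + [(i,) + p for i, a in enumerate(t[1:], 1) for p in fun_positions(a)]

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    return s if not p else t[:p[0]] + (replace(t[p[0]], p[1:], s),) + t[p[0] + 1:]

def rename(t):
    """Rename variables apart by priming them."""
    return t + "'" if isinstance(t, str) else (t[0],) + tuple(rename(a) for a in t[1:])

def critical_pairs(rule1, rule2, identical=False):
    """Critical pairs (l2[r1]_p sigma, r2 sigma, p) from overlaps of rule1 on rule2."""
    l1, r1 = rename(rule1[0]), rename(rule1[1])
    l2, r2 = rule2
    cps = []
    for p in fun_positions(l2):
        if identical and p == ():
            continue                         # no root overlap of a rule with itself
        s = unify(subterm(l2, p), l1)
        if s is not None:
            cps.append((apply(replace(l2, p, r1), s), apply(r2, s), p))
    return cps
```

With `p(x) -> q(x)` as the first rule and `p(x) -> r(x)` as the second, the only critical pair is the outer pair consisting of `q(x')` and `r(x')`, i.e. a renaming of the pair of `q(x)` and `r(x)`.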


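Proposition 1 ([16]). Let $\mathcal{R}$ and $\mathcal{S}$ be left-linear TRSs. If both of the following 
conditions are satisfied, then $\mathcal{R}$ and $\mathcal{S}$ commute: 

1. for any $\langle p, q \rangle \in \mathrm{CP}(\mathcal{R}, \mathcal{S})$, $p \mathrel{-\!\!\shortparallel\!\!\to}_{\mathcal{S}} \circ \xleftarrow{*}_{\mathcal{R}} q$, and 
2. for any $\langle q, p \rangle \in \mathrm{CP}_{in}(\mathcal{S}, \mathcal{R})$, $q \mathrel{-\!\!\shortparallel\!\!\to}_{\mathcal{R}} p$ holds. 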


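The above criterion for commutativity gives rise to a criterion for confluence: a 
left-linear TRS $\mathcal{R}$ is confluent if (1) for any $\langle p, q \rangle \in \mathrm{CP}_{out}(\mathcal{R})$, $p \mathrel{-\!\!\shortparallel\!\!\to}_{\mathcal{R}} \circ \xleftarrow{*}_{\mathcal{R}} q$, 
and (2) for any $\langle q, p \rangle \in \mathrm{CP}_{in}(\mathcal{R})$, $q \mathrel{-\!\!\shortparallel\!\!\to}_{\mathcal{R}} p$ holds. Note that in condition 
(1), considering $\langle p, q \rangle \in \mathrm{CP}_{out}(\mathcal{R})$ is sufficient, instead of considering $\langle p, q \rangle \in 
\mathrm{CP}(\mathcal{R})$, because of the presence of condition (2). 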

A (directed) equation is an ordered pair $\langle u, v \rangle$ of terms, written as e.g. $u \approx v$. 
A conditional rewrite rule has the form $l \to r \Leftarrow u_1 \approx v_1, \ldots, u_k \approx v_k$ where 
$l \notin \mathcal{V}$; here, $u_1 \approx v_1, \ldots, u_k \approx v_k$ is a sequence of (directed) equations, called 
the conditional part of the rule. Often, we will use a meta-variable, say $c$, to 
denote the conditional part of the rule. Let $c = u_1 \approx v_1, \ldots, u_k \approx v_k$. Then, 
for any given substitution $\sigma$, we put $c\sigma = u_1\sigma \approx v_1\sigma, \ldots, u_k\sigma \approx v_k\sigma$. Also, we 
write e.g. $\mathcal{V}(l, c)$ to denote the set of variables occurring in $l$ and $c$. We often 
also treat $c$ as a set $\{u_1 \approx v_1, \ldots, u_k \approx v_k\}$ so as to write $u \approx v \in c$, $c\sigma \subseteq {\sim}$, 
etc., whose meaning should be apparent. The empty sequence is also written as 
$\emptyset$, and $l \to r \Leftarrow \emptyset$ is abbreviated as $l \to r$. 

A conditional term rewriting system (CTRS, for short) is a set of conditional 
rewrite rules. In the literature, CTRSs are categorized into several types 
according to the way of interpreting the conditions of the rules used in the 
definition of their rewrite steps. A rewrite step of an oriented CTRS $\mathcal{R}$ is defined via 
the following TRSs $\mathcal{R}_n$ ($n \in \mathbb{N}$), which are inductively given as follows: $\mathcal{R}_0 = \emptyset$, 
$\mathcal{R}_{n+1} = \{ l\sigma \to r\sigma \mid l \to r \Leftarrow c \in \mathcal{R},\ c\sigma \subseteq \xrightarrow{*}_{\mathcal{R}_n} \}$. A rewrite step $s \to_{\mathcal{R}} t$ of a 
CTRS $\mathcal{R}$ is given as $s \to_{\mathcal{R}} t$ iff $s \to_{\mathcal{R}_n} t$ for some $n$. Note that $m < n$ implies 
$\to_{\mathcal{R}_m} \subseteq \to_{\mathcal{R}_n}$. The smallest $n$ such that $s \to_{\mathcal{R}_n} t$ is called the level of the rewrite 
step $s \to_{\mathcal{R}} t$. We also use the notation $\to_{\mathcal{R}_{<n}} = \bigcup_{i<n} \to_{\mathcal{R}_i}$. We will also write 
$\mathcal{R}_n \vdash c\sigma$ to denote $c\sigma \subseteq \xrightarrow{*}_{\mathcal{R}_n}$. Except in Sect. 4, we will only consider oriented 
CTRSs in this paper, and thus we postpone the treatment of join and 
semi-equational CTRSs until Sect. 4. A CTRS $\mathcal{R}$ is level-confluent if the TRSs $\mathcal{R}_n$ are 
confluent for all $n \ge 0$. One can naturally extend the notion of level-confluence, 
in a way similar to extending confluence to commutation. 
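The level hierarchy can be explored on a small oriented CTRS. The sketch below is only an illustration under several assumptions: it uses the three rules f(0) → a, f(s(x)) → b ⇐ f(x) ≈ a and f(s(x)) → a ⇐ f(x) ≈ b, which have no extra variables, so matching the left-hand side determines the substitution; it computes the set of terms reachable at a level by exhaustive search, which terminates here but not for arbitrary CTRSs; and all names are hypothetical.

```python
from functools import lru_cache

# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argk).
A, B, ZERO = ("a",), ("b",), ("0",)

def s(t):
    return ("s", t)

def f(t):
    return ("f", t)

# Each rule is (lhs, rhs, conditions); a condition (u, v) is read as u ->* v.
RULES = (
    (f(ZERO), A, ()),
    (f(s("x")), B, ((f("x"), A),)),
    (f(s("x")), A, ((f("x"), B),)),
)

def match(pat, t, sub):
    """Extend sub so that pat instantiated by sub equals t, or return None."""
    if isinstance(pat, str):
        if pat in sub:
            return sub if sub[pat] == t else None
        return {**sub, pat: t}
    if isinstance(t, str) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for pa, ta in zip(pat[1:], t[1:]):
        sub = match(pa, ta, sub)
        if sub is None:
            return None
    return sub

def inst(t, sub):
    """Instantiate the variables of t according to sub."""
    if isinstance(t, str):
        return sub.get(t, t)
    return (t[0],) + tuple(inst(a, sub) for a in t[1:])

@lru_cache(maxsize=None)
def successors(t, n):
    """All u with t ->_{R_n} u, where R_0 is empty."""
    if n == 0 or isinstance(t, str):
        return frozenset()
    out = set()
    for lhs, rhs, conds in RULES:
        sub = match(lhs, t, {})
        # The rule instance belongs to R_n if its conditions hold in R_{n-1}.
        if sub is not None and all(
                inst(v, sub) in reach(inst(u, sub), n - 1) for u, v in conds):
            out.add(inst(rhs, sub))
    for i in range(1, len(t)):               # rewrite inside an argument
        for u in successors(t[i], n):
            out.add(t[:i] + (u,) + t[i + 1:])
    return frozenset(out)

@lru_cache(maxsize=None)
def reach(t, n):
    """All u with t ->*_{R_n} u (assumes this set is finite)."""
    seen, todo = {t}, [t]
    while todo:
        for u in successors(todo.pop(), n):
            if u not in seen:
                seen.add(u)
                todo.append(u)
    return frozenset(seen)

def level(t, u, max_n=10):
    """Smallest n <= max_n with t ->_{R_n} u, or None if there is none."""
    for n in range(1, max_n + 1):
        if u in successors(t, n):
            return n
    return None
```

In this example the step from f(0) to a has level 1, the step from f(s(0)) to b has level 2, and the step from f(s(s(0))) to a has level 3, since each condition must first be established one level below.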


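Definition 1 (Level-commutation). CTRSs $\mathcal{R}$ and $\mathcal{S}$ are level-commutative 
if for any $m, n \ge 0$, $\xleftarrow{*}_{\mathcal{R}_m} \circ \xrightarrow{*}_{\mathcal{S}_n} \;\subseteq\; \xrightarrow{*}_{\mathcal{S}_n} \circ \xleftarrow{*}_{\mathcal{R}_m}$. 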


Clearly, level-commutativity (level-confluence) implies commutativity (resp. 
confluence), and self-level-commutativity implies level-confluence. 

A conditional rewrite rule $l \to r \Leftarrow c$ has type 1 if $\mathcal{V}(r, c) \subseteq \mathcal{V}(l)$, type 2 if 
$\mathcal{V}(r) \subseteq \mathcal{V}(l)$, type 3 if $\mathcal{V}(r) \subseteq \mathcal{V}(l, c)$, and type 4 if “true”. A CTRS $\mathcal{R}$ has type 
$n$ if all rules have type $n$; CTRSs of type $n$ are also referred to as $n$-CTRSs. We 
will mainly deal with 3-CTRSs below. Variables occurring in $r, c$ that are not 
contained in $\mathcal{V}(l)$ are often called extra variables. 
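The type classification only depends on variable sets, so it is easy to compute. A minimal sketch, assuming the same illustrative term encoding as before (variables as strings, applications as tuples) and returning the smallest type number that a rule satisfies; the function names are hypothetical.

```python
def variables(t):
    """V(t) for terms encoded as str (variable) or tuple (symbol, args...)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def rule_type(lhs, rhs, conds):
    """Smallest n in {1, 2, 3, 4} such that the rule lhs -> rhs <= conds has type n."""
    vl, vr = variables(lhs), variables(rhs)
    vc = set().union(*(variables(u) | variables(v) for u, v in conds))
    if vr | vc <= vl:
        return 1
    if vr <= vl:
        return 2
    if vr <= vl | vc:
        return 3
    return 4
```

For instance, a rule s(x) → f(y) ⇐ p(x) ≈ y is classified as type 3, because y is an extra variable of the right-hand side that occurs in the condition.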

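We now explain some notions necessary to give a sufficient criterion for 
level-confluence [14]. A CTRS $\mathcal{R}$ is properly oriented if $\mathcal{V}(r) \not\subseteq \mathcal{V}(l)$ implies $\mathcal{V}(u_i) \subseteq 
\mathcal{V}(l) \cup \bigcup_{j=1}^{i-1} \mathcal{V}(v_j)$ for all $1 \le i \le k$, for any $l \to r \Leftarrow u_1 \approx v_1, \ldots, u_k \approx v_k \in \mathcal{R}$. 
A CTRS $\mathcal{R}$ is right-stable if, for all $l \to r \Leftarrow u_1 \approx v_1, \ldots, u_k \approx v_k \in \mathcal{R}$, (1) 
$(\mathcal{V}(l) \cup (\bigcup_{j=1}^{i-1} \mathcal{V}(u_j, v_j)) \cup \mathcal{V}(u_i)) \cap \mathcal{V}(v_i) = \emptyset$ for all $1 \le i \le k$ and (2) for any 
$1 \le i \le k$, $v_i$ is either a linear constructor term or a ground $\mathcal{R}_u$-normal form, 
where the constructors are given by $\mathcal{C} = \mathcal{F} \setminus \{ l(\epsilon) \mid l \to r \Leftarrow c \in \mathcal{R} \}$ and the 
(extended) TRS $\mathcal{R}_u$ is given by $\mathcal{R}_u = \{ l \to r \mid l \to r \Leftarrow c \in \mathcal{R} \}$. A CTRS $\mathcal{R}$ is 
left-linear if $l$ is linear for all $l \to r \Leftarrow c \in \mathcal{R}$. Let $l_1 \to r_1 \Leftarrow c_1$ and $l_2 \to r_2 \Leftarrow c_2$ 
be conditional rewrite rules so that their sets of variables are renamed to be 
disjoint. We say $l_1 \to r_1 \Leftarrow c_1$ overlaps on $l_2 \to r_2 \Leftarrow c_2$ (at $p$) if a non-variable 
subterm $l_2|_p$ of $l_2$ satisfies $l_2|_p\sigma = l_1\sigma$ for some substitution $\sigma$, provided that 
$p \neq \epsilon$ for the case $l_1 \to r_1 \Leftarrow c_1$ and $l_2 \to r_2 \Leftarrow c_2$ are identical. A CTRS 
$\mathcal{R}$ is non-overlapping if there is no overlap between rules of $\mathcal{R}$; a CTRS $\mathcal{R}$ is 
orthogonal if it is left-linear and non-overlapping. 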
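Proper orientation is a purely syntactic condition that can be checked rule by rule. The sketch below follows the definition above for a single rule, assuming the illustrative term encoding used earlier (variables as strings, applications as tuples); the function names are hypothetical, and right-stability is not checked here.

```python
def variables(t):
    """V(t) for terms encoded as str (variable) or tuple (symbol, args...)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def properly_oriented(lhs, rhs, conds):
    """Check proper orientation of the rule lhs -> rhs <= conds.

    conds is a sequence of pairs (u_i, v_i). If rhs has an extra variable,
    every u_i may only use variables of lhs and of v_1, ..., v_{i-1}.
    """
    if variables(rhs) <= variables(lhs):
        return True                      # the implication holds vacuously
    known = set(variables(lhs))
    for u, v in conds:
        if not variables(u) <= known:
            return False
        known |= variables(v)
    return True
```

For example, s(x) → f(y) ⇐ p(x) ≈ y passes the check, whereas f(x) → g(z) ⇐ h(y) ≈ z fails because the left-hand side h(y) of its condition uses the unknown variable y.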


Proposition 2 ([14]). Let R be an orthogonal, properly oriented, right-stable 


3-CTRS. Then, Ž Rim OPR, E : Ry © č R,, for any m,n > 0. In particular, R 
is level-confluent. 


3 Level-Commutation of Oriented CTRSs 


Proposition 1 only deals with TRSs, but its scope is not limited to orthogonal 
ones. On the other hand, Proposition 2 can deal with CTRSs (not only TRSs) 
but is limited to the orthogonal case. Also, Proposition 2 only makes a claim about 
(level-)confluence, whereas Proposition 1 makes a claim about commutativity. A natural question 
is whether, and how, we can unify these two propositions; we will focus on this 
question in this section. 

Our basic idea is to unify the proofs of [16, Theorem 3.1] and [14, Theorem 4.6]. 
The basic scenario of the former proof is showing that GH pots C 4H godt R. 
In the latter, an extended parallel rewriting GR, of ++ was introduced and 
they showed FR, 0 HR, C GR, 0 Rn- Naturally, our first attempt 


was to prove HR, O >S, C Hs, O4 i PR, n- Examining the details, however, 
it turned out that this scenario does not work (induction does not work). Thus, 
our first key ingredient is to modify our proof scenario as showing: 


* * 


PR, OFS, C Hs, O HSan OFPR, (x) 


We now explain why this scenario is sound using an abstract setting. 
Let (—>n)nen be an N-indexed family of relations on a set X. We put Sen = 
Uien ~i- We say (—n)nen is up-simulated if Ž <n © >n for any n E€ N. 


Lemma 1. Let (—>n)nen, (*n) nen be up-simulated families of relations on a set 
X. Suppose that!, for any m,n E€ N, —mor~rn C wn OM cnom. Then, for any 
m,n EN, we have (1) @mo~n C ~n O ~en Os 5 (2) Čm o~n C œn o m 


* * * * 
and (3) Hm On Cn Sm. 


Proof. Use induction. Use (1) to show (2), and then (2) to (3). 


1 The criterion has some similarity with the decreasing diagrams; however, because 
multiple —m-steps are allowed, it is not at all apparent (currently, to the authors) 
whether the criterion can be obtained via the decreasing diagrams. 
Now let us adapt our abstract framework to CTRSs. Let R be a CTRS. The 
notion of extended parallel rewriting [14] is given as follows: we write s GH R, t 
if s = C[Aj,..., Ap], t = C[Bi,..., Bp] (p = 0) for some context C and subterm 
occurrences A;,B; such that either A; >R, Bi or A; ey <n Bi for alli = 
1,...,p. We put =r = U,s ch R,, which is called the extended parallel 


Ar,- A o 
rewrite step of R. We will also write s HGR t to indicate subterm occurrences 
AiyessgAp 
Then, from Lemma 1, the following easily follows: 


* 


Lemma 2. Let R,S be CTRSs. Suppose ~R, 9 HS, E HSn 9 HSan O 


x 

Č Rn for any m,n > 0. Then, for any m,n, we have =R, 0 =s, C Hes, © 
* * * * * 

FR,- Hence, for any m,n, we have —R,, 0 >s, Cs, 9 &R, 


y == 


Proof. Suppose tı ËR, t Š s, to. As >R, C cH R, for each k we have 
ty eR, t GR, t2 (and similarly for S). From the fact >r, C Or, for 
m < n, it immediately follows that (—>n)nen is up-simulated (again, simi- 
larly for S). Thus, it follows tı Hs, A HPR, t2 by using Lemma 1 and our 
hypothesis. Because PR, C +R, for each k (and similarly for S), we obtain 


ty Ss, V Or,, te. 


It is now concluded from this lemma that our proof scenario (*) works to 
obtain the level-confluence. 

For our proof below, we need to use the induction hypothesis to claim a 
more general statement as in the above. The following lemma is presented for 
this purpose. 


Lemma 3. Let R,S be CTRSs and k E€ N. Suppose =R, 0 mes, C Gh sg, 0 
e Sen © “rR, for any m,n such that m+n < k. Then, for any m,n such 
that m+n < k, we have (1) =R, o Sn C H S, OH Sen 0 GHP Ry (2) 
* * * * 


* * * 
4PR,, O s, C HPs, 0R, and (38) 4R,, 0s, C HHS, OHOR, 


Proof. Use an abstract version of the lemma, which can be proved in the way 
similar to Lemma 1. 


Our second key ingredient is the following alternative definition of conditional 
critical pairs. 


Definition 2 (Condition-separated CCP). Suppose $l_1 \to r_1 \Leftarrow c_1$ overlaps 
on $l_2 \to r_2 \Leftarrow c_2$ at $p$ and $\sigma$ is an mgu of $l_2|_p$ and $l_1$. Then the quadruple 
$\langle l_2[r_1]_p\sigma, r_2\sigma \rangle \Leftarrow \langle c_1\sigma, c_2\sigma \rangle$ is called a (condition-separated) conditional critical 
pair (CCP, for short) (obtained from that overlap); when $p = \epsilon$, the pair is 
called outer, and when $p > \epsilon$, the pair is called inner. The set of (outer, inner) critical 
pairs obtained from overlaps of $l_1 \to r_1 \Leftarrow c_1 \in \mathcal{R}$ on $l_2 \to r_2 \Leftarrow c_2 \in \mathcal{S}$ is 
denoted by $\mathrm{CCP}(\mathcal{R}, \mathcal{S})$ (resp. $\mathrm{CCP}_{out}(\mathcal{R}, \mathcal{S})$, $\mathrm{CCP}_{in}(\mathcal{R}, \mathcal{S})$). The set of (outer, 
inner) critical pairs from overlaps of rules of $\mathcal{R}$ is denoted by $\mathrm{CCP}(\mathcal{R})$ (resp. 
$\mathrm{CCP}_{out}(\mathcal{R})$, $\mathrm{CCP}_{in}(\mathcal{R})$). 
In most of the literature, instead of distinguishing the two sequences $c_1\sigma$ 
and $c_2\sigma$, the combined sequence of $c_1\sigma$ and $c_2\sigma$ is employed in the definition of 
CCPs. In our case, however, where the CTRSs $\mathcal{R}$ and $\mathcal{S}$ may be different, this distinction 
is important for stating the conditions of our theorem precisely. 

We now present one more preparation: the following lemma is used several 
times as a part of the proof of our main theorem—when the lemma is used in the 
proof of our main theorem, the assumption (f) of the lemma can be inferred from 
the induction hypothesis (of the proof of the main theorem), using Lemma 3. 


Lemma 4. Let $\mathcal{R}$ and $\mathcal{S}$ be 3-CTRSs and suppose that $\mathcal{R}$ is left-linear and 
right-stable. Suppose that $M = l\sigma$, $N = r\sigma$, $\mathcal{R}_{m-1} \vdash c\sigma$ with $l \to r \Leftarrow c \in \mathcal{R}$. Assume 
Py,...,P af athe 
moreover that M “44> Sa, P and P,,...,Pp occurs in the substitution o. Assume 


that G) der, ot oe i S; 2< i >R; for any i,j such thati +j < m+n. 
Then, there exists Q such that N 5, Q and P >r, Q. 


Now we present our critical pair criterion for commutativity. 


Theorem 1. Let R and S be left-linear, properly oriented, right-stable 3- 
CTRSs. If the following conditions are satisfied, then R and S are level- 
commutative: 


1. for any (u,v) = (c,d) € CCP(R,S), m,n > 1 and substitution p, if cp C 
* ' * * * 
>R ıı and d'p C >s, then up Sn O Sen O Rm UP, and 

2. for any (v,u) <= (d,c) E€ CCP in(S,R), m,n > 1 and substitution p, if cp C 
* 


TR 


* 
m1 and d'p C Ss, then vp HPR, © SRo UP. 
tba 


Lye Am B * 
ar rR, Nand M => s, P. We show Nats, octs., Q and 


A 
Proof. Let M 


P r., Q for some Q. For the rewrite steps used in the critical pairs conditions 


above, note that ++, o Bec = >p 0 Ą A ><p as well as Š, = ity for any £. 
Let $\Gamma$ and $\Delta$ be sets of subterm occurrences in the term $M$ given as follows: 


Thus, $\Gamma$ consists of the subterm occurrences $A_i$ that are proper subterm occurrences 
of some $B_j$ and the subterm occurrences $B_j$ that are subterm occurrences of some 
$A_i$; $\Delta$ consists of the subterm occurrences $A_i$ and $B_j$ not contained in $\Gamma$. Clearly, 
each $A_i$ belongs to exactly one of $\Gamma$ and $\Delta$, and so does each $B_j$. 
In the case that $A_i$ and $B_j$ are the same 
subterm occurrence, we put $A_i$ in $\Delta$ and $B_j$ in $\Gamma$. 

A denotes the set of maximal redexes occurrences in the following sense. 
Let A = {M,,..., Mp}. Then we have M = C[M,,..., Mp] for some context 
C. Furthermore, we have N = C[Nj,...,Np] and P = C|P,,...,P5] for some 
Nj,. mars Np, Pi, see Pp such that Mi =R, Ni, Mics, P; (i = ly: . ,p). Thus, 
it suffices to show for each M;, there exists Q; such that N; Gs, 0 HHS, Qi 
and P; r, Qi. On the other hand, $\Gamma$ is used to count the size of overlaps and 
is used to give the induction weight. Let $|\Gamma| = \sum_{D \in \Gamma} |D|$. Our proof proceeds 
by induction on the lexicographic combination of $\langle m + n, |\Gamma| \rangle$. 

The cases for $m = 0$ or $n = 0$ are easy; thus we consider the cases for 
$m > 0$, $n > 0$. We distinguish two cases: 


1. Case M; ¢ {B,,..., Br}. Note that M; € {Aj,...,Am} and M; C B; for 
no Bj. Let {Bji,...,Bj} = {B; | 1 < j < n,B; C Mi}. Then we have 


M; = C;{Bi,..., Bl] and P; = C;,[B4,..., BY] so that M; Sorn Nj and 

Mi ee P;. We distinguish the cases. 

(a) Case M; >r,,_, Nj. Since Sr, C HR we have M; 4hR, Ni. 
Thus, the desired Q; is obtained by induction hypothesis and Lemma 3. 

(b) Case M; r, N;. Then M; = 10, N; = r0 and Rm—1 + cO for some 
l— r < cE nR and 8. If all redex occurrences B; in M; are contained 
in the substitution 0, then the desired Q; exists by Lemmas 3, 4 and 
induction hypothesis. Suppose otherwise, i.e. there exists B; which is not 
contained in 6. Let X = {B; | 1 < j < q, B} is not contained in @ } and 
Y = {Bi |1 < j < q, B} is contained in 0 }. For each B; € X, either 


m—1)? 


B. Po a 
Bi Ss,, Bi or B; Sson Bi. We distinguish two cases. 


$ 


Bi 
Case that there exists Bi € X such that B; ar By. W.l.o.g. sup- 


=. 


Bi 
pose j = d i.e. Ai € X and B| Ss, Bi. Let M; Bion M;i. Note also 


pety 


here M; n s, P;. The proof of this case is illustrated in Fig. 1. Let 
l — r’ <d € S, Bi =I and Sn- F e0’. Then, since Bj is not 
contained in 0, L —> r <4 c € R and l’ —> r’ < d € S overlap. Fur- 
thermore, as Bi C Mi, we have (v,u) < (d,c) E€ CCPin(S, R) and 
there exists a substitution 0” such that M; = v0” and N; = ud”. By 
our a ma condition (2), we obtain M; =r, Qi HOR n Ni; 
let Mm, ° a „n Qi. Let I = {C; | 3B;(j #1). Ci C Bi} U {Bi 
1, 3C}. Bie (€ Cy. Occurrences in I” are distinct, and for any B € I”, 
there exists Bi (2 < j < q) such that BC Bi. Thus, |I| < = |B;] 
holds. Hence, we obtain |I”| < X5- |B;| < X5- |B;| < |T|. Thus, 


ee E o B3,..,B3 
one can apply induction hypothesis to Q; oe Rim M; =o S, P; so 


as to obtain Qi, P, such that Qi Hs, Or SH Sen P, and P; HFR P,. 
Since we have N; HPR 2 Qi sS, Q, by applying induction hypoth- 
esis and Lemma 3, it follows that there exists N; such that N; cts, 


o Hs, Ñ; and Â! HOR om Ñi. Then, by induction hypothesis and 
Lemma 3, it follows that there exists Q; such that N; Hes, Qi and 
È; GOR om Qi. 

ii Case that B; Š sn Bi holds for any B; € X. As M; oe s, P; 
and Bi,..., Bt are parallel, we can first rewrite all B; € Y ( 

Fig. 1. Case 1.(b).i 

Fig. 2. Case 1.(b).ii 
P . BY pai 
j< 9: Namely, let Y = {B/,..., BZ}, and we have M; c's, 


Mi; Š S,_, Pi. The proof of this case is illustrated in Fig. 2. Here, since 
each B} in contained in the substitution 0, one can use Lemma 4 to 
obtain Q such that N; => s,, Q and M; Rin Q. Now, since >r, C 


* x ~ * 
er, and Ss, C cHs,_,, we have Q dr, Mi =s, Pi 


Then, using induction hypothesis and Lemma 3, we can obtain Q; 


such that Q = Qi, Pi HR, Q;. As a side remark, we mention 
that our first key ingredient becomes necessary to solve this case. 




2. Case M; € {Bi,.. 


Then one can put M; = Cil 


1 


Mi R Mi . . 
and Mi= s,„ P;. By definition, Mis, P; is either of the form M; Š s 


M; 
or Mi Sy P 


., Ba}. Let {Aj,..., Ag} = {Aj | 1 < j < n, A}; C MG}. 
z Aly Ah 

ten A Ni = Cili,- Â, Mi S Rp Ni 

ı Pi 
Suppose M; Ssn P;. Then, we have M; es, P; and thus the desired 
Q; exists by induction hypothesis and Lemma 3. 

Thus, it remains to consider the case M; sl P;. Then there exists l’ 3 r’ = 
c € S and @ such that M; = 1'6', P; = 1/6! and d'0! C 45,_,. We distinguish 
whether all redex occurrences Aj in M; are contained in 0’ or not. If all redex 


occurrences A; in M; are contained in 6’, then using +s, C Hs, 0482, 


and =R, C Čr, one obtains desired Q; by Lemma 4. 
So, let us consider there exists A’; which is not contained in 0’. Let X’ = 
{Aj | 1 < j < ¢,A; is not contained in 6’} and Y’ = {Aj | 1 <j < 


Al 5 
q, Aj, is contained in 6’}. Then for each A} € X’, we have either A} SR, Ai, 


W FY E x 
or Aj Rem Ay. We distinguish two cases. 


(a) Case that Aj ey, At for some Aj € X’. W.l.o.g. assume j = 1, i.e. 
Ai € X’ and Aj ae A’. Then there exists l > r < c € R such that 
Ai = 10 and cô C >p,,_,. We further distinguish two cases: (a) the case 
Ai = M; andl > r & c € R are l’ — r’ 4 d € S are identical, and (8) 
the case A| # M; orl —> r & c € R and l’ > r’ & d € S are distinct. 
We remark that a construction similar to the one in [14] will be used in 
case of (a) and that our assumption that R and S are properly oriented 
and right-stable will be used here. 

i Case (a). Then we have M; = Aj a A’, = N; and M; xc P;. By 
10 = M; = 16’, x0 = x6’ for any x € V(I). We also have Rm—1 F cO 
and S,-1 F c0. Thus, if V(r) C V(J), then r0 = r6’, and it suffices 
to take rð as Q;. Suppose otherwise, i.e. V(r) V(I). Below, let 
c= 8&1 X ti,..., Sj 7X tj and Ck = S1 X ti;,...; Sk Z tk (1 < k < j). 
We now show there are substitution pp (k € {0,...,7}) satisfying the 
following properties (a)—(c) by induction. 

(a) pe = 8 =8' VCD). 
(b) dorila) EVO UNE). 
(c) for any x € V(DUV (cp), we have z0r, tpr and 2OGHs,_, 
TPk. 
If k = 0 then take po = O|y1), and (a)-(c) follow. Suppose k > 0. 
Since r contains an extra variable and R (or S) is properly oriented, 
we have V(s,) C V(I) U V(cx_-1). Thus, by induction hypothesis on 


* * 
(c), we have 5,0 =H s,„_ı SkPpk-1 and 8,0’ >R, nı SkPk-1. Further- 


n-1 
more, we have s0 Rpt tkô and sk’ Š s, tkO! by Rm—1 F c8 
and S,_1 F c6’, respectively. Hence, SkPk—1 Hie. skO Rn tô 
and tg’ HS sg’ HORo SkPk—1- Then, by applying induction 


hypothesis and Lemma 3, we obtain q’,r’ such that s,pp—1 HOR 


* * * ; 
qd os,,_, th and t,0’ GH R,,_, 1’ 4 s,,_, §kPk—-1- Thus, one obtains 


red >S,_1 SkPk—1 ee q'. Again, by applying induction hypoth- 


i 


=e 
n ‘ * * 
esis and Lemma 3, we obtain s’ such that r’ =r sors, 7. 


m—1 
Thus, we have t0 hs, _, s’ and t0 HH Ren 1 g 

We know that tp is either a ground R,,-normal form or a linear con- 
structor term (w.r.t. R) by the right-stability of R, and that tẹ is 
either a ground S,,-normal form or a linear constructor term (w.r.t. 
S) by the right-stability of R. Suppose tk is a ground R,-normal 
form or tk is a ground S,-normal form. Then, ¢,6’ = t,0 = tk by 
V(t.) = 0, and thus, tk = s’ by t,6’ eee s’. Furthermore, as we 
are assuming V(r) Z V(l), we know V(s;) C V(I) U V(ci—1) from the 
proper-orientedness of R (or S). Thus, V(I) UV(cx) = VI) UV (cp_1). 
Hence, pk := pPx—1 satisfies (a)—(c). Suppose otherwise. Then tẹ is 
linear and is a constructor term w.r.t. both R and S. Then, by 
tk Hesa s', there exists a substitution p such that s’ = tp and 
dom(p) C V(t) such that for any x € V(tp), 70 G45 


thermore, by t0’ HR, 4 s’, there exists a substitution p’ such 
that s’ = t,p’ and dom(p’) C V(t,) such that for any x € V(tx), 
xo! HOR na xp’. Now, because tkp = s’ = tkp', we know xp = xp! 
for any x € V(t), and thus p = p’ from dom(p), dom(p’) C V(tx). 
We also have V(t) N (V(I) U V(cg_1)) = 9 by the right-stability of R 
(or S), and thus, dom(p) N dom(pz-1) = 0. Hence, pp := pr-1U p is 
a substitution, and pp satisfies (a)—(c). This completes the induction 
proof for existence of substitutions px satisfying (a)—(c) (1 < k < j). 
Now consider the substitution p;. Since R (and S) is a 3-CTRS, we 


have V(r) C V(l)UV(c;). Thus, by the condition (c), N; = rdHs,_, 
rp; and P; = ré’ PR, rp; hold. Thus, taking Q; := rp;, and we 


xp. Fur- 


n—l1 


* * 
have N; =s, Qi and P; Ge R,,_, Qi- 


Case (8). Let M; = Mi a N;. The proof of this case is 
illustrated in Fig.3 (left). Because there exists an overlap between 
l—r cE nR anadl’ —> r’ < d ES, there is substitution 0” and 
a position p € Pos-(I’) such that M; = l0” = l0" (10"]p = 10” [A}]p- 
Then, M; = l'[r]p0", Pi = 1'0", Raa F c0” and SF c0”. Then, 
there exists an CCP (u,v) = (d,d’) E€ CCP(R,S), where u = l'[r]po, 
v = r'o, d = co and d' = co for the mgu o of l'|p and J. Then, as 
('0"), = 10”, we have 6” = poo for some p. Thus, P; = r/9” = 
(r’c)p = vp, Mi = V[r]p6" = (U'[r]po)p = up, Rm—1 dp, and Sn-1 F 
d'p. Hence, by our critical pair condition (2), up Gres, © Hs, 5 
and vp R., s for some s, and thus, by taking P; := sp, we have 
Mi hs, P! Ason P; and P; GPR, P, for some Pi. 


Suppose M; “YH 5 PY. Let I’ = {Al | 30A c OC} ULG; | 


3A;.C; C Aj}. Occurrences in I” are distinct, and for any Cer, 
there exists Aj (2 < j < q) such that GC Aj. Hence, |I"| < 
(b) 


Finally, from Lemma 2 we conclude that R and S are level-commutative. 


Fig. 3. Case 2.(a).ii (left) and Case 2.(b) (right) 


3 |Aj| < De |A’| < |I|. Thus, one can apply induction hypoth- 
esis to obtain Q; such that N; >s, 0 esn Q; and P! OR Qi. 
By applying induction hypothesis and Lemma 3 once again, we know 
that there exists Q; such that Q Hs, Qi HPR, È,. 


Al, At 
Case that A; Ry At for any A; € X’. Since M; C Rn Ni 
and Aj,.. Ab are parallel, one can Terit A; € Y' first. That is, 


Mi Annae Mi ->r,,_, Ni where Y’ = {A},..., A!}. The proof of 
this case is illustrated in Fig. 3 (right). Then, as each A? is contained in 
6’, by Lemma 4, there exists Q such that M; Sn Q and P; =R, Q. 
Furthermore, as >s, C cH s, and tee. 


Lc eR one can apply 
induction hypothesis and Lemma 3 to N; “Rr, , Mi >s, Q to obtain 


Qi such that N; cts, © $ i Sen Qi and Q He, Qi. 


m—1)? 


A level-confluence criterion is obtained by taking $\mathcal{R} = \mathcal{S}$. Note that, in contrast to 
the commutativity criterion, one can use $\mathrm{CCP}_{out}$ instead of $\mathrm{CCP}$ in the first condition, 
as the second condition implies the part of it for $\mathrm{CCP}_{in}(\mathcal{R})$. 


Corollary 1. Let $\mathcal{R}$ be a left-linear, properly oriented, right-stable 3-CTRS. If 
the following conditions are satisfied, then $\mathcal{R}$ is level-confluent: 


1. for any (u,v) = (c,d) € CCP ou(R), m,n > 1 and substitution p, if cp C 


* 
Rin 


ts 
_, and pC Rat then up APR, O Ron OR» VP, and 


2. for any (v,u) = (d,c) E€ CCPin(R), m,n > 1 and substitution p, if cp C 
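Rp and d'p C Sr, then vp APR» OP Rem UP- 

Example 1. Let $\mathcal{R}$ and $\mathcal{S}$ be the following CTRSs: 

$$\mathcal{R} = \left\{ \begin{array}{l} p(x) \to q(x) \\ r(x) \to s(p(x)) \\ s(x) \to f(y) \Leftarrow p(x) \approx y \end{array} \right\} \qquad \mathcal{S} = \left\{ \begin{array}{l} p(x) \to r(x) \\ q(x) \to s(p(x)) \\ s(x) \to f(y) \Leftarrow p(x) \approx y \end{array} \right\}$$ 

We have $\mathrm{CCP}(\mathcal{R}, \mathcal{S}) = \{ \langle q(x), r(x) \rangle \Leftarrow \langle \emptyset, \emptyset \rangle \}$ and $\mathrm{CCP}_{in}(\mathcal{S}, \mathcal{R}) = \emptyset$. Note that 
the overlap of $s(x) \to f(y) \Leftarrow p(x) \approx y \in \mathcal{R}$ and $s(x) \to f(y) \Leftarrow p(x) \approx y \in \mathcal{S}$ 
is not considered, as these rules are identical; case 2.(a).i of the proof above 
treats this case. Now, because we have $q(x) \to_{\mathcal{S}_n} s(p(x))$ and $r(x) \to_{\mathcal{R}_m} s(p(x))$ 
($n, m \ge 1$), condition (1) of Theorem 1 is satisfied. The other conditions of 
the theorem are also satisfied. Thus, $\mathcal{R}$ and $\mathcal{S}$ are level-commutative. Similarly, 
one can show $\mathcal{R} \cup \mathcal{S}$ is level-confluent. 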
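The joinability of the single CCP in Example 1 only needs the unconditional rules and can be replayed mechanically. The sketch below is an illustration under assumed encodings (variables as strings, applications as tuples; `R_RULES` and `S_RULES` list only the unconditional rules of the two systems, which are level-1 rules); it checks root steps only and is not a general CCP test.

```python
def match(pat, t, sub):
    """Extend sub so that pat instantiated by sub equals t, or return None."""
    if isinstance(pat, str):
        if pat in sub:
            return sub if sub[pat] == t else None
        return {**sub, pat: t}
    if isinstance(t, str) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for pa, ta in zip(pat[1:], t[1:]):
        sub = match(pa, ta, sub)
        if sub is None:
            return None
    return sub

def inst(t, sub):
    """Instantiate the variables of t according to sub."""
    if isinstance(t, str):
        return sub.get(t, t)
    return (t[0],) + tuple(inst(a, sub) for a in t[1:])

# Unconditional rules of the two systems from Example 1.
R_RULES = [(("p", "x"), ("q", "x")), (("r", "x"), ("s", ("p", "x")))]
S_RULES = [(("p", "x"), ("r", "x")), (("q", "x"), ("s", ("p", "x")))]

def root_steps(t, rules):
    """Results of applying one rule at the root of t."""
    out = set()
    for lhs, rhs in rules:
        sub = match(lhs, t, {})
        if sub is not None:
            out.add(inst(rhs, sub))
    return out

# The CCP consists of q(z) and r(z); z stands for the variable of the pair.
u, v = ("q", "z"), ("r", "z")
common = root_steps(u, S_RULES) & root_steps(v, R_RULES)
```

Here `common` contains exactly the term s(p(z)), reached from q(z) by a rule of the second system and from r(z) by a rule of the first.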


Example 2. Take CTRSs $\mathcal{R} = \mathcal{R}' \cup \mathcal{R}_f$ and $\mathcal{S} = \mathcal{S}' \cup \mathcal{R}_f$ such that 


RI = Peo ane s= eE a 
y) > (x,y) > p(z,y) =y ~b 


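and $\mathcal{R}_f = \{ f(0) \to a,\ f(s(x)) \to b \Leftarrow f(x) \approx a,\ f(s(x)) \to a \Leftarrow f(x) \approx b \}$. 
We have $\mathrm{CCP}(\mathcal{R}, \mathcal{S}) = \{ (a) : \langle r(x, y), q(x, y) \rangle \Leftarrow \langle \{x \approx a\}, \{y \approx b\} \rangle,\ (b) : 
\langle a, b \rangle \Leftarrow \langle \{f(x) \approx b\}, \{f(x) \approx a\} \rangle,\ (c) : \langle b, a \rangle \Leftarrow \langle \{f(x) \approx a\}, \{f(x) \approx b\} \rangle \}$, and 
$\mathrm{CCP}_{in}(\mathcal{S}, \mathcal{R}) = \emptyset$. For the CCP (a), let $m, n \ge 1$ and $\rho$ be any substitution, and 
suppose that $\rho(x) \xrightarrow{*}_{\mathcal{R}_{m-1}} a$ and $\rho(y) \xrightarrow{*}_{\mathcal{S}_{n-1}} b$. Then, we have $r(\rho(x), \rho(y)) \to_{\mathcal{S}_n} 
p(\rho(x), \rho(y))$ and $q(\rho(x), \rho(y)) \to_{\mathcal{R}_m} p(\rho(x), \rho(y))$. Also, note that there is no 
term $t$ such that $t \xrightarrow{*}_{\mathcal{R}} b$ and $t \xrightarrow{*}_{\mathcal{S}} a$ (or $t \xrightarrow{*}_{\mathcal{R}} a$ and $t \xrightarrow{*}_{\mathcal{S}} b$). Thus, 
condition (1) of Theorem 1 holds for CCPs (a)-(c). The other conditions of the 
theorem are also satisfied. Thus, $\mathcal{R}$ and $\mathcal{S}$ are level-commutative. Similarly, one 
can show $\mathcal{R} \cup \mathcal{S}$ is level-confluent. 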


Since TRSs can be regarded as CTRSs with no conditions and they are trivially
properly oriented, right-stable, and of type 3, this theorem covers
Proposition 1. However, this does not mean that our theorem broadens the scope
of TRSs that can be guaranteed to commute: because rewrite steps of TRSs are
level 1 rewrite steps in CTRSs, our condition reduces to that of Proposition 1
for TRSs. Thus, when restricted to TRSs, Theorem 1 coincides with Proposition 1.

On the other hand, Corollary 1 properly extends Proposition 2, as witnessed
by $R \cup S$ in Examples 1 and 2.


4 Critical Pair Criteria for Join and Semi-Equational 
CTRSs 


In this section, we explore critical pair criteria for join and semi-equational 
CTRSs, following our approach in the previous section. 

First, let us fix additional notions and notations that will be used in this
section. A rewrite step of a join CTRS $R$ is defined via the following TRSs $R_n$
($n \in \mathbb{N}$), which are inductively given as follows: $R_0 = \emptyset$, $R_{n+1} = \{ l\sigma \to r\sigma \mid l \to
r \Leftarrow c \in R,\ c\sigma \subseteq {\to^*_{R_n}} \circ {\leftarrow^*_{R_n}} \}$. For a semi-equational CTRS $R$, we modify the
second clause as: $R_{n+1} = \{ l\sigma \to r\sigma \mid l \to r \Leftarrow c \in R,\ c\sigma \subseteq {\leftrightarrow^*_{R_n}} \}$. Similarly to
the oriented case, a rewrite step $s \to_R t$ of $R$ is given as $s \to_R t$ iff $s \to_{R_n} t$ for
some $n$, and the smallest $n$ such that $s \to_{R_n} t$ is called the level of the rewrite
step $s \to_R t$. We write $\downarrow_{R_n}$ ($\downarrow_R$) for the relation ${\to^*_{R_n}} \circ {\leftarrow^*_{R_n}}$ (resp. ${\to^*_R} \circ {\leftarrow^*_R}$).

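In this section (except Subsect. 4.2), in order to distinguish the three types of
CTRSs, we write $R^o$ for an oriented CTRS, $R^j$ for a join CTRS, and $R^s$ for a
semi-equational CTRS. Similarly, notations $R^o_n$, $R^j_n$, ... are employed. The notations
$R^o_n \vdash c\sigma$ ($R^j_n \vdash c\sigma$, $R^s_n \vdash c\sigma$) stand for $c\sigma \subseteq {\to^*_{R^o_n}}$ (resp. $c\sigma \subseteq {\downarrow_{R^j_n}}$, $c\sigma \subseteq {\leftrightarrow^*_{R^s_n}}$).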

The following basic relations between the rewrite relations of the three types
of CTRSs at each level are essentially proved in [18, Lemmas 1 and 2].


Lemma 5. Let $R$ be a CTRS. Then ${\to_{R^o_n}} \subseteq {\to_{R^j_n}} \subseteq {\to_{R^s_n}}$ for each $n$.


Notions of orthogonality, proper-orientedness and right-stability are
syntax-oriented, and their definitions remain the same for the other types of CTRSs. Note that
even under the conditions of proper-orientedness and right-stability, ${\to_{R^o_n}} =
{\to_{R^j_n}}$ does not hold in general.


4.1 Level-Confluence of Join and Semi-Equational 3-CTRSs 


In [14, Corollary 5.3], Proposition 2 is applied to show that the corresponding
class of join CTRSs is level-confluent:


Proposition 3 ([14]). Let $R$ be an orthogonal, properly oriented, right-stable
3-CTRS. Then $R^j$ is level-confluent.


Given our Theorem 1, a natural question is whether a similar extension is
possible for our theorem. In this subsection, we give a partially positive answer
to this question: we generalize the result above to the level-confluence part
(Corollary 1) of our theorem, even though a similar extension does not work for
level-commutation. Indeed, we show that the above proposition can be extended to a
more general setting of CTRSs where the orthogonality requirement is replaced
with level-confluence of $R^o$. Furthermore, the generalization is obtained not only
for join CTRSs but also for semi-equational CTRSs.

The next two lemmas are abstractions of [14, Lemmas 5.1 and 5.2]; their
proofs remain almost the same.


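Lemma 6. Let $R$ be a properly oriented, right-stable 3-CTRS such that $R^o$ is
level-confluent. Let $l \to r \Leftarrow s_1 \approx t_1, \dots, s_j \approx t_j \in R$. If $s_i\sigma \downarrow_{R^o_{n-1}} t_i\sigma$ for any
$1 \leq i \leq j$ then $l\sigma \downarrow_{R^o_n} r\sigma$.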


Lemma 7. Let $R$ be a properly oriented, right-stable 3-CTRS such that $R^o$ is
level-confluent. If $s \to_{R^s_n} t$ then $s \downarrow_{R^o_n} t$.


Now we present the claimed result: 
Theorem 2. Let $R$ be a properly oriented, right-stable 3-CTRS. If $R^o$ is
level-confluent then $R^j$ and $R^s$ are level-confluent.


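Proof. Let $R$ be a properly oriented, right-stable 3-CTRS such that $R^o$ is
level-confluent. Suppose $t_1 \leftarrow^*_{R^j_n} s \to^*_{R^j_n} t_2$ ($t_1 \leftarrow^*_{R^s_n} s \to^*_{R^s_n} t_2$). Then $t_1 \leftarrow^*_{R^s_n} s \to^*_{R^s_n}
t_2$ by Lemma 5. Thus, by Lemma 7, $t_1 \leftrightarrow^*_{R^o_n} t_2$. Hence, $t_1 \downarrow_{R^o_n} t_2$ follows by
the level-confluence of $R^o$. Using Lemma 5 again, this implies $t_1 \downarrow_{R^j_n} t_2$ (resp.
$t_1 \downarrow_{R^s_n} t_2$).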


Thus, Corollary 1 can be applied to show the level-confluence of join and
semi-equational CTRSs. Note here that the conditions of Corollary 1 are stated
in terms of $\to_{R^o}$, not in terms of $\to_{R^j}$ or $\to_{R^s}$.


4.2 Commutation of Semi-Equational 3-CTRSs 


The most fundamental ingredient of the proof presented above (inherited from [14]) is
induction on the level of the rewrite relation. Applying this approach to join
and semi-equational CTRSs, however, seems to face a fundamental difficulty.
Without induction on the level, what can we do within the parallel-closed
approach? In this subsection, we exhibit one alternative approach for
semi-equational CTRSs.

In [1], it is reported that left-linear parallel-closed semi-equational 1-CTRSs
are confluent. By examining the details of its proof, we can extend it to commutativity
of 3-CTRSs as follows. Below, the notation $R \vdash c\sigma$ (etc.) stands for $c\sigma \subseteq {\leftrightarrow^*_R}$.


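Theorem 3. Let $R, S$ be semi-equational left-linear 3-CTRSs. Suppose the
following conditions are satisfied:


1. for any $(u,v) \Leftarrow (c,d) \in \mathrm{CCP}(R,S)$ and any substitution $\rho$, if $R \vdash c\rho$ and
$S \vdash d\rho$, then $u\rho \mathrel{{-}\!\!\parallel\!\!{\to}}_S \circ \leftarrow^*_R v\rho$, and

2. for any $(v,u) \Leftarrow (d,c) \in \mathrm{CCP}_{in}(S,R)$ and any substitution $\rho$, if $R \vdash c\rho$ and
$S \vdash d\rho$, then $v\rho \mathrel{{-}\!\!\parallel\!\!{\to}}_R u\rho$.


Furthermore, assume ${\to_S} \subseteq {\leftrightarrow^*_R}$, ${\to_R} \subseteq {\leftrightarrow^*_S}$ and $R \cap S$ is a 2-CTRS. Then,
$R$ and $S$ commute.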


We remark that the conditions ${\to_S} \subseteq {\leftrightarrow^*_R}$ and ${\to_R} \subseteq {\leftrightarrow^*_S}$ are used to close
nested peaks, and that the condition that $R \cap S$ is a 2-CTRS is required to
resolve peaks obtained by the same rule.


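Example 3. Let $R$ and $S$ be the following left-linear semi-equational 3-CTRSs:


$$R = \{\, q(x,y) \to p(y,x),\;\; p(x,y) \to q(x',y') \Leftarrow x \approx x',\, y \approx y' \,\}$$
$$S = \{\, p(x,y) \to q(y,x),\;\; q(x,y) \to p(x',y') \Leftarrow x \approx x',\, y \approx y' \,\}$$


By induction on the level $n$, one can show ${\to_{S_n}} \subseteq {\leftrightarrow^*_{R_n}}$ and ${\to_{R_n}} \subseteq {\leftrightarrow^*_{S_n}}$.


Thus, the conditions ${\to_S} \subseteq {\leftrightarrow^*_R}$ and ${\to_R} \subseteq {\leftrightarrow^*_S}$ are satisfied. Clearly, $R \cap S = \emptyset$
is a 2-CTRS. We have $\mathrm{CCP}(R,S) = \{ (q(x',y'), q(y,x)) \Leftarrow (\{x \approx x', y \approx
y'\}, \emptyset),\ (p(y,x), p(x',y')) \Leftarrow (\emptyset, \{x \approx x', y \approx y'\}) \}$ and $\mathrm{CCP}_{in}(S,R) =
\emptyset$. Clearly, $\rho(x) \leftrightarrow^*_R \rho(x')$ and $\rho(y) \leftrightarrow^*_R \rho(y')$ imply $q(\rho(x'),\rho(y')) \to_S p(\rho(x),\rho(y)) \leftarrow_R
q(\rho(y),\rho(x))$, and $\rho(x) \leftrightarrow^*_S \rho(x')$ and $\rho(y) \leftrightarrow^*_S \rho(y')$ imply $p(\rho(x'),\rho(y')) \to_R q(\rho(x),\rho(y)) \leftarrow_S
p(\rho(y),\rho(x))$. Thus, all conditions of Theorem 3 are satisfied. Thus, $R$ and
$S$ commute.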


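Note that the conditions ${\to_S} \subseteq {\leftrightarrow^*_R}$ and ${\to_R} \subseteq {\leftrightarrow^*_S}$ of Theorem 3 imply
${\leftrightarrow^*_R} = {\leftrightarrow^*_S}$, i.e. $R$ and $S$ have the same underlying logic.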


5 Conclusion 


We have given a critical pair criterion for ensuring level-commutativity of
left-linear properly oriented right-stable oriented 3-CTRSs. Our result generalizes a
sufficient criterion for commutativity of left-linear TRSs by Toyama [16]. It also
properly extends the level-confluence result for orthogonal properly oriented right-stable
oriented 3-CTRSs by Suzuki et al. [14]. We have then shown that this result can be
applied to obtain a criterion for level-confluence of left-linear properly oriented
right-stable join and semi-equational 3-CTRSs, generalizing a result of [14]. We
have also explored a similar but different approach of Aoto and Toyama [1] to
obtain a criterion for the commutation of semi-equational 3-CTRSs.

Wirth [17] also gave a criterion of level-confluence for possibly non-orthogonal
CTRSs that generalizes a sufficient criterion for confluence of left-linear TRSs
of [16]. He adapted the approach of [16] to a framework of join CTRSs. His work also
incorporates some ideas of [14] so as to give the notions of (weak-)quasi-normal
CTRSs, etc. A key difference from the usual conditional rewriting, such
as employed in our paper, however, is that the validity of conditions needs to
be satisfied under a kind of constructor discipline. This restriction considerably
simplifies proof arguments dealing with conditional parts, at the price
of departing from the standard framework. On the other hand, despite these
sharp differences between the underlying frameworks of ours and [17], interestingly,
the critical pair criterion of Theorem 3 and Wirth’s critical pair criterion ([17,
Definition 28]) closely resemble each other.

Over various formalisms of rewriting, considerable effort has been spent on
automating confluence checks in recent years. A yearly competition² of confluence
tools started in 2012, and a CTRS category was introduced in 2014. In
recent competitions, the CTRS category has focused on confluence of oriented
3-CTRSs, which our main theorem deals with. Known confluence tools for
CTRSs include CONFident [6], ConCon [13], CO3 [9] and ACP [2]. We note here
that all these tools fail to show confluence of $R \cup S$ of Example 2.³ Among these
tools, (at least) ConCon and ACP incorporate a check of the confluence criterion of
[14]. We have been working on the automation of our results, but it is still under


2 http://project-coco.uibk.ac.at/.
3 Experimented for CoCo 2022 participants ACP, CO3, CONFident and a CoCo 2020 
participant ConCon, via CoCoWeb [7]. 
development. Recent advances in confluence tools for CTRSs include automation
of infeasibility checking [5]; we believe some approaches for automation of
infeasibility checking can be adapted for automation of our criterion.

Formalization in interactive theorem provers such as Isabelle/HOL, Coq,
PVS4, etc. has been of great interest in recent years. Formalization is also
indispensable for certification of results obtained by confluence tools. For
the results of [14], a formalization in Isabelle/HOL has been reported by
Sternagel and Sternagel [12]. On the other hand, formalization of our results remains
entirely future work.


Acknowledgements. Thanks are due to the anonymous reviewers (including those 
supported by JSPS KAKENHI No. 21K11750. 
Abstract. Byzantine fault-tolerant distributed systems are designed to 
provide resiliency despite arbitrary faults, i.e., even in the presence of 
agents who do not follow the common protocol and/or despite compro- 
mised communication. It is, therefore, common to focus on the perspec- 
tive of correct agents, to the point that the epistemic state of byzantine 
agents is completely ignored. Since this view relies on the assumption 
that faulty agents may behave arbitrarily adversarially, it is overly con- 
servative in many cases. In blockchain settings, for example, dishonest 
players are usually not malicious, but rather selfish, and thus just fol- 
low some “hidden” protocol that is different from the protocol of the 
honest players. Similarly, in high-availability large-scale distributed sys- 
tems, software updates cannot be globally instantaneous, but are rather 
performed node-by-node. Consequently, updated and non-updated nodes 
may simultaneously be involved in a protocol for solving a distributed 
task like consensus or transaction commit. Clearly, the usual assumption 
of common knowledge of the protocol is inappropriate in such a setting. 
On the other hand, joint protocol execution and, sometimes, even basic 
communication becomes problematic without this assumption: How are 
agents supposed to interpret each other’s messages without knowing their 
mutual communication protocols? We propose a novel epistemic modal- 
ity creed for epistemic reasoning in heterogeneous distributed systems 
with agents that are uncertain of the actual communication protocol 
used by their peers. We show that the resulting logic is quite closely 
related to modal logic S5, the standard logic of epistemic reasoning in 
distributed systems. We demonstrate the utility of our approach by sev- 
eral examples. 


1 Introduction 


A distributed system is a system with multiple processes, or agents, located 
on different machines that communicate and coordinate actions, via message 
passing or shared memory, in order to accomplish some task [8,21]. This common 
task is achieved by means of agent protocols instructing agents how to exchange 
information and act. Designing distributed systems is difficult due to the inherent 
uncertainty agents have about the global state of the system, caused, e.g., by 
different computation speeds and message delays. 

Knowledge [15] is a powerful conceptual way of reasoning about this uncer- 
tainty [13,14]. Indeed, knowledge is at the core of the agents’ ability to act 
according to the protocol: According to the Knowledge of Preconditions
principle [22], a protocol instruction to act based on a precondition $\varphi$ can only be
followed if the agent knows $\varphi$ to hold. While trivial for preconditions based on
the local state of the acting agent itself, this observation comes to the fore for 
global preconditions, also involving other agents, as is common for coordination 
problems such as consensus. 

One of the standard ways of modeling agents’ knowledge is via the possible 
world semantics that takes into account all the possible global states the agents 
can be in and which of these possible worlds a particular agent can distinguish 
based on its local information. In this view, agent $i$ knows a proposition $\varphi$,
written $K_i\varphi$, in a global state $s$ iff $\varphi$ holds in all global states $s'$
that are indistinguishable from $s$ for $i$. The primary means of obtaining new
knowledge — and the only way of increasing knowledge about the local states 
of other agents — in a distributed system is by means of communication. 
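
As an illustration of the possible world semantics just described, the following minimal sketch evaluates knowledge over an explicit, finite set of global states. The toy system (two agents with one bit of local state each) and all names in it are assumptions made for the example, not part of the paper's formal model.

```python
from itertools import product

def knows(agent, prop, state, states, local_view):
    """K_agent(prop) holds at `state` iff prop holds at every global state
    that `agent` cannot distinguish from `state` using its local view."""
    here = local_view(agent, state)
    return all(prop(other) for other in states if local_view(agent, other) == here)

# Hypothetical toy system: a global state is the pair (bit of agent 0, bit of agent 1).
STATES = list(product((0, 1), repeat=2))

def local_view(agent, state):
    # An agent only observes its own component of the global state.
    return state[agent]

def agent0_has_one(state):
    return state[0] == 1

def agent1_has_one(state):
    return state[1] == 1

# Agent 0 knows facts about its own local state ...
print(knows(0, agent0_has_one, (1, 1), STATES, local_view))
# ... but, without communication, not facts about agent 1's local state.
print(knows(0, agent1_has_one, (1, 1), STATES, local_view))
```

In this sketch, communication would correspond to shrinking the set of global states an agent considers possible, which is the only way its knowledge about other agents' local states can grow.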

Fault-tolerant systems add another layer of complexity, in particular when
processes may not only stop operating or drop messages but can be (or become)
byzantine [19], i.e., may behave arbitrarily erroneously and, in particular, can
communicate in an erratic, arbitrary, or deceptive manner. Malicious faulty agents may
have a “hidden agenda”, in which case, instead of following the original
commonly known protocol, a faulty agent (or a group of faulty agents) can execute
actions (possibly in concert with each other) that jeopardize the original goals
of the system.

Although these hidden agendas are typically not transparent for correct 
agents, some assumptions must be made to restrict the types and numbers of 
protocol-defying actions and messages. Without such restrictions, provably cor- 
rect solutions for a distributed task do not exist. These assumptions must usually 
be commonly known by all agents, like the basic communication mechanism, the 
protocol of all correct agents, the data encoding used in its messages, etc. In [7], 
the whole corpus of these common assumptions is referred to as a priori
knowledge.¹ For the possible world semantics, this translates into the assumption of
common knowledge of the model [3], which enables the agents to compute epis- 
temic states of other agents, a task necessary for a typical coordination problem 
like consensus [6]. 

Since correct agents generally cannot distinguish a simple malfunction from 
malintent, erroneous messages, i.e., messages sent in contravention of the com- 


1 The focus of [7] is on a priori assumptions that can be erroneous and may require 
later updates, hence, the term a priori beliefs there. In this paper, we generally 
assume these assumptions to be factive, hence, we use a priori knowledge instead. 
monly known joint protocol, are usually left uninterpreted. For instance, in the
epistemic modeling and analysis framework [11, 16-18] for byzantine agents, a
message $\varphi$ received from agent $i$ is interpreted by means of the hope modality


$$H_i\varphi := \mathit{correct}_i \to B_i\varphi,$$


where $B_i\varphi$ represents belief of agent $i$ and is understood in the spirit of belief as
defeasible knowledge [24], where


$$B_i\varphi := K_i(\mathit{correct}_i \to \varphi).$$

This hope modality $H_i\varphi$ is equivalent to a disjunction

$$\neg\mathit{correct}_i \vee (\mathit{correct}_i \wedge B_i\varphi),$$


suggesting that a message $\varphi$ from $i$ is interpreted as the uncertainty between
agent $i$ being faulty or the epistemic state of $i$ confirming $\varphi$ in case $i$ is a correct
agent. Note that in the former case, the message carries no meaning whatsoever.
Indeed, the axiomatization of hope in [10] takes $H_i\bot$ to be the definition of faulty
agents because only a faulty agent can send contradictory messages. Given that
in normal modal logic $H_i\bot \to H_i\varphi$ holds for any $\varphi$, the consequence is that a
faulty agent can send any message independent of its epistemic state. In other
words, no conclusions about the epistemic state of a faulty agent can be drawn
from its messages, as reflected in the hope modality.
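
The equivalence between the implication form of hope and its disjunctive reading is purely propositional once the truth values of "agent i is correct" and "agent i believes the message" are fixed at a state. A minimal sketch that checks it by exhausting the truth table (function names are illustrative only):

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def hope(correct, belief):
    """Hope as an implication: correct_i -> B_i(phi)."""
    return implies(correct, belief)

def hope_as_disjunction(correct, belief):
    """Hope as a disjunction: not correct_i, or (correct_i and B_i(phi))."""
    return (not correct) or (correct and belief)

# Compare the two readings on all four valuations.
EQUIVALENT = all(
    hope(correct, belief) == hope_as_disjunction(correct, belief)
    for correct, belief in product((False, True), repeat=2)
)
print(EQUIVALENT)
```

The table also makes the remark about faulty agents visible: whenever `correct` is false, hope holds regardless of `belief`, so the message conveys nothing about the sender's epistemic state.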

However, not all systems exhibit such a stark dichotomy between commonly 
known and fully transparent us (correct processes) and the mysterious and unin- 
terpretable them (faulty processes). Rational agents in blockchain settings [12], 
for instance, do not necessarily have the same goal as the rest of the system. 
Nevertheless, neither their actions nor their communication are arbitrary, not to 
speak of adversarial. Consequently, game theoretic modeling, based on a model 
of their beliefs and goals, can be applied for the analysis of such systems [2]. 

In this paper, we extend this finer-grained view to the epistemic modeling of 
distributed systems and consider heterogeneous distributed systems, where dif- 
ferent processes may run different protocols and where the assumption that all 
protocols are commonly known is dropped. In such systems, we assume that pro- 
cesses are partitioned into types (or roles, or classes) of agents, so that within 
one type the protocols are commonly known to the agents of that type. While 
such a strong assumption is not made for agents of different types, we do not 
assume them to have zero knowledge of each other’s protocol either. In particu- 
lar, we assume that each class is equipped with an interpretation function that 
encodes the amount of knowledge agents have regarding the preconditions for 
communication agents of a different type have. 

Since having no preconditions for sending a message is an allowed instance, 
this setting generalizes the byzantine setting described earlier, where there are 
two types — correct and faulty agents — and only messages of correct agents 
have a non-trivial interpretation. These interpretation functions are formalized 
by means of the new creed modality $\mathbb{C}^{A \setminus B}_p\varphi$ introduced in this paper, which
generalizes the hope modality for the byzantine case and represents the information
an agent of type $A$ can infer upon receiving message $\varphi$ from agent $p$ of type $B$.

We illustrate the communication scenarios where this creed modality may be 
useful by means of some examples: 


Example 1 (“The Murders in the Rue Morgue”). This famous story by Edgar 
Allan Poe describes a murder mystery. Several witnesses heard the murderer 
(agent m) but nobody saw m. The problem in interpreting their testimony 
is that they seem to contradict each other: for instance, a French witness f 
thinks m spoke Italian and is certain m was not French, whereas a Dutch wit- 
ness d thought m was French, etc. Importantly, none of the witnesses could 
understand what was being said (f does not speak Italian, while d does not 
speak French, etc.). The standard byzantine framework considers the possibil- 
ity of a faulty agent sending different messages to different agents to confuse 
them, but provides no means to describe one uncorrupted message being treated 
so differently by correct agents. Standard epistemic methods either accept all 
incoming information as being of equal value or make a priori preferential judge- 
ments. However, in the story, Monsieur C. Auguste Dupin correctly surmises 
that m spoke neither of the languages. Dupin neither dismisses witness accounts 
completely as lies nor accepts them completely. Instead he chooses some of the 
witness statements over others without prejudging them. 


Example 2 (Knights and Knaves puzzles). There is a series of logical puzzles, 
popularized by Smullyan [26], about an island, all inhabitants of which are either 
knights who always tell the truth or knaves who always lie. One of the simplest 
ones [26, Puzzle 28] is as follows: 


There are only two people, p and q, each of whom is either a knight or a 
knave. p makes the following statement: “At least one of us is a knave.” 
What are p and q? 


Our goal is to incorporate the uncertainty about the mode of communication 
(knaves lie/knights tell the truth) into the logic. Fault-tolerant systems do not
provide a satisfactory model since, in such systems, information from faulty agents is either
accepted (in case of benign faults) or ignored as completely unreliable (in case of 
byzantine faults). Instead, enough information is collected from correct agents 
(and they must constitute an overwhelming majority for most problems to be 
solvable). By contrast, knights and knaves puzzles are typically solvable even if 
all agents involved are knaves. The answer to the puzzle above, for instance, is 
that p is a knight and q is a knave. We would like to derive this answer fully 
within the logic. 
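
Outside the logic, the stated answer to the puzzle can be confirmed by brute force over the four possible role assignments. This is only a sanity check of the expected solution, not the in-logic derivation the paper aims for; the function names are illustrative.

```python
from itertools import product

def statement(p_is_knight, q_is_knight):
    """The content of p's claim: at least one of p and q is a knave."""
    return (not p_is_knight) or (not q_is_knight)

def consistent(p_is_knight, q_is_knight):
    # A knight's statement must be true, a knave's statement must be false.
    return statement(p_is_knight, q_is_knight) == p_is_knight

# Each solution is a pair (p is a knight, q is a knight).
SOLUTIONS = [
    (p_is_knight, q_is_knight)
    for p_is_knight, q_is_knight in product((True, False), repeat=2)
    if consistent(p_is_knight, q_is_knight)
]
print(SOLUTIONS)
```

The only consistent assignment is that p is a knight and q is a knave, matching the answer given in the text.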


Example 3 (Software Updates). In a highly available large-scale distributed
system like an ATM network, it is impossible to simultaneously update the software
executed by the processes. Rather, processes are usually updated more or less 
sequentially during normal operation of the system, at unpredictable times. As 
a consequence, the joint protocol executed in the system while a software update 
is in progress might mix both old and new protocol instances. Existing solutions 
like [1,25], which aim at updating complex protocols/software, typically provide 
“consistent update” environments that prevent such mixing. 

Thanks to our creed modality, however, mixed joint protocols could be 
allowed, by explicitly considering those in the development of the new protocol 
instance: Indeed, when implementing a bug fix or feature update, the developer 
obviously knows the previous implementation. A message received at some pro- 
cess p from some process q in the new implementation just needs to be interpreted 
differently, depending on whether q runs an old or a new protocol instance. Note 
that backward compatibility typically rules out incorporating a version number 
into the messages of the (new) protocol here, in which case p would be uncertain 
about the actual status of q, despite having received a message from it. 

For light-weight low-level protocols, this approach might indeed constitute 
an attractive alternative to complex consistent update mechanisms. 


After introducing our framework, we explain in Sect. 6 how these examples
could be formalized. 


Related Work. Our logical framework generalizes the hope modality [10] intro- 
duced to reason about byzantine agents in distributed systems. We extend the 
standard formulation by considering the byzantine case as a special agent-type. 
Agent-types in the field of epistemic logic are formulated in [5], where names are 
used as abstract roles for groups of agents, depending on their characteristics. 
From the dynamic epistemic logic [9] perspective, a public announcement logic 
with agent types is presented in [20], providing a dynamic framework to rea- 
son about uncertainty of agent-types that is used to formalize the knights and 
knaves puzzle. Due to the different motivations, while treating a closely related
set of problems, [5] and [20] make different and at times incomparable choices
regarding the postulates underlying the systems. For instance, a precondition 
for an announcement for an agent in [20] need not entail the agent knowing 
this precondition, which contradicts the fundamental Knowledge of Precondi- 
tions principle for distributed systems [22]. On the other hand, all agents in [20] 
possess the same knowledge about each of the existing agent types, in particular, 
all agents share one common interpretation of messages from a particular type, 
an assumption in line with the rather centralized nature of updates in dynamic 
epistemic logic but less sensible for distributed systems. 


Paper Organization. In Sect. 2, we introduce the basic preliminary definitions
and lemmas for describing heterogeneous distributed systems where agents are
grouped into types, each characterized by a different protocol. In Sect. 3, we
provide an epistemic logic for representing heterogeneous distributed settings 
by introducing the creed modality and prove soundness and completeness in 
Sect. 4. We derive the properties of creed in Sect. 5. Having done that, in Sect. 6 
we show how to apply this framework to the motivating examples. Finally, some 
conclusions are provided in Sect. 7. 
2 Heterogeneous Distributed Systems 


In this paper, we focus on heterogeneous distributed systems where agents are of 
different types characterized by different protocols. All agents are assumed to be 
at most benign faulty,² in the sense that they do not take actions not specified by
their protocol, cannot communicate wrong information, and have perfect recall. 
At any time, however, agents may change their type, i.e., change their protocol. 

These different protocols partition the set of processes into different types, 
which are identified with the names of the protocols. The set of all existing 
types is commonly known to all the agents. All agents of the same type, which 
typically work towards the same goal, use the same protocol that is commonly 
known to all agents of this type. What is not generally known to an agent is the 
distribution of agents into types and the actual protocol of a type different from 
its own. In other words, agent a generally knows neither the type nor the
protocol of agent b.

Communication in the system is governed by the protocols. Whereas all 
protocols must use the same basic communication mechanism and a common 
layering structure [23], i.e., (possibly non-synchronous) communication rounds, 
agents of different types generally communicate according to different protocol 
rules, data formats, encodings, etc. Communication actions are triggered by pre- 
conditions that depend on the protocol of the agent’s type. Consequently, the 
interpretation of each message depends on: 


— the knowledge of the receiver about the type(s) the sender may belong to; 
— the knowledge of the receiver about the communication protocol of this (these) 


type(s). 


More formally, we consider a finite set of processes $\Pi = \{p_1, \dots, p_n\}$ that
communicate with each other by using a joint communication mechanism, such
as shared memory objects or point-to-point messages. Each process
executes some protocol with a name (= type) taken from a commonly known set of
names $\mathcal{A}$. However, no assumption is made about the types and the actual
protocols of distinct agents $i$ and $j$ being identical or mutually known. All protocols are
organized in a common, possibly non-synchronous communication round
structure. We also require that the system has a common notion of time, represented
by a directed set $T$. Common choices for $T$ are the set of natural numbers $\mathbb{N}$,
or even the set of real numbers $\mathbb{R}$. It should be noted that in Definitions 4-5,
we assume that concepts such as configuration and protocol match the standard
notions in the distributed computing literature [4,21].


Definition 4 (Heterogeneous distributed system). We say that a tuple
$(\Pi, \mathcal{A}, \mathcal{P}, C, T)$ is a heterogeneous distributed system iff


- $\Pi = \{p_1, \dots, p_n\}$ is a finite set of processes;
- $\mathcal{A} = \{A_1, \dots, A_k\}$ is a partition of $\Pi$ into agent types;


2 Adding byzantine faults to the picture will be left for future research.
- $\mathcal{P} = \{P_1, \dots, P_k\}$ is a collection of protocols that correspond to $\mathcal{A}$, one protocol
per agent type;

- $C$ is a communication medium; and

- $T$ is a directed set representing global times.


The joint protocol of $(\Pi, \mathcal{A}, \mathcal{P}, C, T)$ is the protocol formed by the protocols of all
the agents.
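
The structural side of Definition 4 can be sketched as a small data type. The sketch below only models the processes, the partition into types, and the one-protocol-per-type requirement; the communication medium and the time domain are omitted, protocols are represented by placeholder names, and the class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

@dataclass(frozen=True)
class HeterogeneousSystem:
    processes: FrozenSet[str]           # the finite set of processes
    types: Dict[str, FrozenSet[str]]    # type name -> processes of that type
    protocols: Dict[str, str]           # type name -> protocol (placeholder name)

    def is_well_formed(self) -> bool:
        """Check that `types` is a partition of `processes` and that there is
        exactly one protocol per agent type."""
        blocks = list(self.types.values())
        members = [p for block in blocks for p in block]
        nonempty = all(len(block) > 0 for block in blocks)
        disjoint = len(members) == len(set(members))
        covering = set(members) == set(self.processes)
        one_protocol_each = set(self.protocols) == set(self.types)
        return nonempty and disjoint and covering and one_protocol_each

    def type_of(self, process: str) -> str:
        """Return the unique type a process belongs to."""
        for name, block in self.types.items():
            if process in block:
                return name
        raise KeyError(process)

# Hypothetical instance with two types.
SYSTEM = HeterogeneousSystem(
    processes=frozenset({"p1", "p2", "p3"}),
    types={"Correct": frozenset({"p1", "p2"}), "Faulty": frozenset({"p3"})},
    protocols={"Correct": "intended_protocol", "Faulty": "arbitrary_protocol"},
)

# Not a partition: p2 appears in two blocks.
BROKEN = HeterogeneousSystem(
    processes=frozenset({"p1", "p2"}),
    types={"Old": frozenset({"p1", "p2"}), "New": frozenset({"p2"})},
    protocols={"Old": "v1", "New": "v2"},
)

print(SYSTEM.is_well_formed(), BROKEN.is_well_formed())
```

Note that `type_of` describes the global view of the system designer; an individual agent is generally not assumed to know this mapping for other agents.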


In this setting, given multiple possibly non-cooperating teams of agents, we 
need to re-define the notion of tasks and solvability. In particular, we generally 
cannot impose restrictions on the output of processes in other partitions. 


Definition 5 (Partial task). We say that a tuple $(S, \mathcal{I}, \mathcal{O}, \Delta)$ is a partial task
relative to $S \subseteq \Pi$ iff $\mathcal{I}$ is a set of input configurations for $\Pi$; $\mathcal{O}$ is a set of output
configurations for $S$; and $\Delta$ is a validity correspondence that maps valid initial
configurations of the system to a subset of valid output configurations for $S$.


Definition 6 (Solvability). Let $(\Pi, \mathcal{A}, \mathcal{P}, C, T)$ be a heterogeneous distributed
system. We say that agents of type $A_i \in \mathcal{A}$ can solve a partial task $\mathcal{T} =
(S, \mathcal{I}, \mathcal{O}, \Delta)$ iff for any input configuration $\sigma \in \mathcal{I}$, the execution of the joint
protocol of $(\Pi, \mathcal{A}, \mathcal{P}, C, T)$ leads to an output configuration $\rho|_S \in \Delta(\sigma)$.


Note that traditional distributed systems with benign failures fall into the
particular case where $\mathcal{A} = \{\Pi\}$ and there is one unique protocol executed by
all processes. Similarly, distributed systems with send-restricted byzantine faults
(no false perceptions of received messages, but arbitrary message sending) could
be modeled as an instance with two types $\mathcal{A}^B = \{\mathit{Correct}, \mathit{Faulty}\}$, where all
agents of type Correct follow the intended protocol, whereas agents of type Faulty
can arbitrarily deviate from it.


3 Epistemic Logic for Heterogeneous Distributed 
Systems 


We consider a heterogeneous distributed system $(\Pi, \mathcal{A}, \mathcal{P}, C, T)$ according to Def-
inition 4, where processes are partitioned into different types according to their 
protocol. Agents of the same type share a common protocol, which also includes 
information on how to interpret messages from agents of various types. Recall 
that we assume that each process knows its own protocol/type, and, therefore, 
the protocol of all other agents of the same type, but not necessarily which agents 
are of this type. In particular, an agent may be unsure whether another agent 
belongs to its own type or not. 
Agents interpret received messages by means of an interpretation function: 


Definition 7 (Interpretation function). Let $F$ be the set of well-defined
formulas used by agents to communicate. An interpretation function for type $A \in
\mathcal{A}$ with respect to type $E \in \mathcal{A}$ messages is any function $f_{AE} : F \to F$.
Intuitively, $f_{AE}(\varphi)$ corresponds to the knowledge that type $A$ agents (or simply
$A$ agents) have about the preconditions for $E$ agents to send message $\varphi$. We
assume that the function $f_{AE}$, for every type $E$, is a priori known by every $A$ agent,
as part of its protocol.


Example 8. The interpretation function $f_{AE}(\varphi) := \top$ for all $\varphi \in F$ corresponds
to the case when $A$ agents have no knowledge about the communication
protocol of $E$ agents. For instance, byzantine agents who can send any message
at any time (send-unrestricted byzantine agents) can be captured by choosing
$f_{\mathit{Correct},\mathit{Faulty}}(\varphi) = \top$ for the partition $\mathcal{A}^B = \{\mathit{Correct}, \mathit{Faulty}\}$. The minimal
requirement that all correct agents tell the truth translates into $f_{\mathit{Correct},\mathit{Correct}}(\varphi) = \varphi$.
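
The two interpretation functions of this example can be written down directly. In the sketch below formulas are plain strings, the constant `TOP` stands for the trivially true formula, and the lookup table keyed by (receiver type, sender type) is an assumed representation chosen for illustration.

```python
TOP = "T"  # stands for the trivially true formula

def no_knowledge(formula: str) -> str:
    """Receiver knows nothing about the sender's preconditions: always TOP."""
    return TOP

def truthful(formula: str) -> str:
    """Sender is assumed to tell the truth: the message is its own precondition."""
    return formula

# Interpretation functions indexed by (receiver type, sender type).
INTERPRETATION = {
    ("Correct", "Faulty"): no_knowledge,
    ("Correct", "Correct"): truthful,
}

def interpret(receiver_type: str, sender_type: str, formula: str) -> str:
    """What a receiver of the given type can take a message to mean,
    under the assumption that the sender has the given type."""
    return INTERPRETATION[(receiver_type, sender_type)](formula)

print(interpret("Correct", "Faulty", "x = 1"))
print(interpret("Correct", "Correct", "x = 1"))
```

Richer settings, such as the knights and knaves puzzle, would add further entries to the table, for instance one mapping each message to its negation.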


Since we want to be able to express partition membership in our language 
and formulas, we need to define partition membership atoms. 


Definition 9 (Propositional variables and partition atoms). We consider, 
for each process $p_i \in \Pi$, a finite set $\mathit{Prop}_i$ of propositional variables. 
In addition, for each agent type $A \in \mathcal{A}$, we consider the set $\Pi_A := \{A_p \mid p \in \Pi\}$ 
of partition atoms. The set of all atomic propositions is defined as 

$$\mathit{Prop} := \bigcup_{i=1}^{n} \mathit{Prop}_i \cup \bigcup_{A \in \mathcal{A}} \Pi_A.$$


Since A is a partition, every agent belongs to one and only one type. For 
convenience, we denote the type of agent p by p. Furthermore, we will assume 
that each agent knows its own type, i.e., Kp(pp). 

Now that we have established the basics of our heterogeneous distributed 
systems, we can proceed to define the language. 


Definition 10 (Language of EHL). The language $\mathcal{L}$ of the epistemic heterogeneous 
logic extends the standard (multi-modal) epistemic language by a new 
family of modalities called creed and is given by the grammar: 

$$\varphi ::= r \mid \neg\varphi \mid (\varphi \wedge \varphi) \mid K_p \varphi, \qquad (1)$$

where $r \in \mathit{Prop}$ is an atomic proposition (i.e., propositional variable or partition 
atom), $p \in \Pi$ is an agent, and $A, E \in \mathcal{A}$ are agent types. Other boolean connectives, 
as well as boolean constants $\top$ and $\bot$, are defined in the usual way. We 
use the following derived modalities: $\hat{K}_p \varphi := \neg K_p \neg\varphi$ and creed, defined as 

$$C_p^{A \backslash E} \varphi := E_p \to K_p f_{AE}(\varphi) \qquad (2)$$

for any agent $p \in \Pi$ and agent types $A, E \in \mathcal{A}$. 


Creed $C_p^{A \backslash E}\varphi$ represents the amount of information an $A$ agent can extract 
from a message $\varphi$ received from agent $p$ under the assumption that $p$ belongs 
to type $E$ of the partition. It is based on the a priori knowledge $A$ agents possess 
of the preconditions for an $E$ agent to send message $\varphi$, as encoded in the 
interpretation function $f_{AE}$ from Definition 7, which is external to the language. 
This precondition already takes into account the Knowledge of Preconditions 
principle [22], by assuming that the sender must know that the preconditions 
hold. We use the standard Kripke model semantics with additional restrictions 
for partition atoms: 


Definition 11 (Semantics). Let $\langle \Pi, \mathcal{A}, P, C, T \rangle$ be a heterogeneous distributed 
system and $\{f_{AE} \mid A, E \in \mathcal{A}\}$ be the collection of interpretation 
functions for it. An (epistemic) Kripke frame $F = (W, \sim)$ 
is a pair of a non-empty set $W$ of worlds (or states) and a function 
$\sim : \Pi \to \mathcal{P}(W \times W)$ that assigns to each agent $p \in \Pi$ an equivalence 
relation ${\sim_p} \subseteq W \times W$ on $W$. A Kripke model $M = (W, \sim, V)$ 
is a triple where $(W, \sim)$ is an epistemic Kripke frame and $V : W \to \mathcal{P}(\mathit{Prop})$ 
is a valuation function for atomic propositions. The truth relation $\models$ between 
Kripke models and formulas is defined as follows: $M, s \models r$ iff $r \in V(s)$ for 
any $r \in \mathit{Prop}$; cases for the boolean connectives are standard; $M, s \models K_p\varphi$ iff 
$M, t \models \varphi$ for all $t \in W$ such that $s \sim_p t$. As usual, validity in a model, denoted 
$M \models \varphi$, means $M, s \models \varphi$ for all $s \in W$. 

A Kripke model M = (W,~,V) is called an EHL model iff the following two 
conditions hold: 


1. For any state $s \in W$ and any agent $p \in \Pi$, 

   $$|V(s) \cap \{A_p \mid A \in \mathcal{A}\}| = 1, \qquad (3)$$

   i.e., exactly one of the partition atoms $A_p$ involving agent $p$ is true at state $s$. 
2. For any agent $p$, any agent type $A$, and any pair of states $s$ and $t$, 

   $$s \sim_p t \implies (A_p \in V(s) \iff A_p \in V(t)), \qquad (4)$$

   i.e., $p$ can distinguish worlds where it is of different types. 


General validity, denoted $\models \varphi$, means $M \models \varphi$ for all EHL models $M$. 


Example 12. For the interpretation functions from Example 8 for send-unrestricted 
byzantine agents, $C_p^{\mathit{Correct} \backslash \mathit{Faulty}}\varphi = \mathit{Faulty}_p \to K_p\top$. For epistemic models, it 
is logically equivalent to $\top$, meaning that no information can be gleaned from a 
message under the assumption that it is sent by a fully byzantine agent without 
perception flaws. At the same time, for truth-telling correct agents, 

$$C_p^{\mathit{Correct} \backslash \mathit{Correct}}\varphi = \mathit{Correct}_p \to K_p\varphi,$$

which closely matches the hope modality 

$$H_p\varphi = \mathit{Correct}_p \to K_p(\mathit{Correct}_p \to \varphi)$$

from [10]. Indeed, since we assume agents to know their own type, 
$\mathit{Correct}_p \to K_p \mathit{Correct}_p$ holds, making $H_p\varphi$ equivalent to $C_p^{\mathit{Correct} \backslash \mathit{Correct}}\varphi$. 


Example 13. Apart from helping to understand messages, an interpretation function 
can be used to gain knowledge about the type of the sender. For instance, 
if $A$ agents know enough about the way $E$ agents communicate to conclude that 
a particular message $\varphi$ can never be sent by an $E$ agent, which corresponds to 
$f_{AE}(\varphi) = \bot$, then $C_q^{A \backslash E}\varphi = E_q \to K_q\bot$. For epistemic models, such $C_q^{A \backslash E}\varphi$ is 
logically equivalent to $\neg E_q$. In other words, having received $\varphi$ from agent $q$, an 
$A$ agent $p$ learns at least $K_p \neg E_q$. 
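
The semantics of Definition 11 and the two equivalences claimed in Examples 12 and 13 can be checked mechanically on small models. The following is a minimal sketch, not part of the original development: the tuple encoding of formulas, the helper names (`holds`, `is_ehl_model`, `creed`, `type_atom`), and the two-world toy model are all assumptions made for illustration.

```python
# Formulas are nested tuples (an illustrative encoding, not taken from the paper):
#   ("atom", name), ("top",), ("bot",), ("not", f), ("imp", f, g), ("K", agent, f)

def holds(model, s, f):
    """Return True iff formula f is true at world s of model = (worlds, rel, val)."""
    worlds, rel, val = model
    tag = f[0]
    if tag == "atom":
        return f[1] in val[s]
    if tag == "top":
        return True
    if tag == "bot":
        return False
    if tag == "not":
        return not holds(model, s, f[1])
    if tag == "imp":
        return (not holds(model, s, f[1])) or holds(model, s, f[2])
    if tag == "K":  # K_p f: f holds in every world p cannot distinguish from s
        return all(holds(model, t, f[2]) for t in worlds if (s, t) in rel[f[1]])
    raise ValueError(f"unknown connective {tag!r}")

def type_atom(agent_type, agent):
    """Name of the partition atom A_p (hypothetical naming scheme)."""
    return f"{agent_type}_{agent}"

def is_ehl_model(model, agents, types):
    """Check conditions (3) and (4) on partition atoms."""
    worlds, rel, val = model
    for p in agents:
        for s in worlds:
            if sum(type_atom(a, p) in val[s] for a in types) != 1:  # condition (3)
                return False
        for s, t in rel[p]:
            for a in types:  # condition (4)
                if (type_atom(a, p) in val[s]) != (type_atom(a, p) in val[t]):
                    return False
    return True

def creed(agent, assumed_type, interp, phi):
    """Build E_p -> K_p f_AE(phi); `interp` plays the role of f_AE."""
    return ("imp", ("atom", type_atom(assumed_type, agent)),
            ("K", agent, interp(phi)))

# Toy model: q is Correct in w0 and Faulty in w1; p is Correct in both worlds
# and cannot tell them apart, whereas q can.
worlds = ["w0", "w1"]
identity = {(w, w) for w in worlds}
rel = {"p": {(s, t) for s in worlds for t in worlds}, "q": identity}
val = {"w0": {"Correct_p", "Correct_q", "x"}, "w1": {"Correct_p", "Faulty_q"}}
model = (worlds, rel, val)

x = ("atom", "x")
trivial = creed("q", "Faulty", lambda phi: ("top",), x)     # as in Example 12
impossible = creed("q", "Faulty", lambda phi: ("bot",), x)  # as in Example 13
```

On this model, `trivial` holds in every world, while `impossible` holds exactly where `Faulty_q` fails, matching the equivalences with $\top$ and $\neg E_q$ stated in the two examples. A single model is of course only a sanity check, not a proof of validity.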


Remark 14 (Information from message passing). Let $p, q \in \Pi$ be agents and 
$\mathcal{A}$ be a partition of $\Pi$. The knowledge gained by agent $p$ of type $A$ upon receiving 
a message $\varphi$ from agent $q$ can be described by $K_p C_q^{A}\varphi$, where 

$$C_q^{A}\varphi := \bigwedge_{E \in \mathcal{A}} C_q^{A \backslash E}\varphi. \qquad (5)$$

In other words, knowing its own type, $p$ considers all possible types for the 
sender $q$ and for each type considers the respective interpretation of the message; 
the conjunction combined with the implications within creed makes sure that the 
appropriate type is chosen. Note that the presence of send-unrestricted agents 
from Example 12 adds a conjunct to (5) that is equivalent to $\top$. Hence, 
send-unrestricted agents can be safely ignored in determining the message meaning. 
By the same token, some conjuncts in (5) can rule out a particular type for 
agent $q$ as in Example 13. Finally, if $p$ has already ruled out some type $E$, then 
$K_p \neg E_q$ logically implies $K_p(E_q \to K_q f_{AE}(\varphi))$ independent of the interpretation 
function. In this case, the $E$-conjunct of (5) becomes redundant. 


Example 15. In the system from Example 8 with send-unrestricted byzantine 
agents, upon receiving message $\varphi$ from agent $q$, agent $p$ can ignore the possibility 
of the sender being Faulty and conclude $\mathit{Correct}_q \to K_q\varphi$, i.e., hope $H_q\varphi$ for the 
case of factive beliefs, in full accordance with [10]. Note also that $p$ may infer 
$K_q\varphi$ from this message if $p$ is sure that $q$ is correct. 


Having established the basic definitions and semantics for the 
logic, we now provide an axiomatization, which we prove sound and complete 
in the next section. 


Definition 16 (Logic EHL). Let $\langle \Pi, \mathcal{A}, P, C, T \rangle$ be a heterogeneous distributed 
system and $\{f_{AE} \mid A, E \in \mathcal{A}\}$ be the collection of interpretation functions for it. 
Logic EHL is obtained by adding to the standard axiomatization of the modal logic 
of knowledge S5 the partition axioms P1-P3. The resulting axiom system is as 
follows: for all $p \in \Pi$, all $A \in \mathcal{A}$, and all $E \in \mathcal{A}$ such that $E \neq A$, 


Taut   All propositional tautologies in the language of EHL; 
k      $K_p(\varphi \to \psi) \to (K_p\varphi \to K_p\psi)$; 
4      $K_p\varphi \to K_p K_p\varphi$; 
t      $K_p\varphi \to \varphi$; 
5      $\neg K_p\varphi \to K_p \neg K_p\varphi$; 
(MP)   rule inferring $\psi$ from $\varphi \to \psi$ and $\varphi$; 
(Nec)  rule inferring $K_p\varphi$ from $\varphi$; 
P1     $\bigvee_{A \in \mathcal{A}} A_p$; 
P2     $A_p \to \neg E_p$; 
P3     $A_p \to K_p A_p$.    (6) 

Partition axiom P1 states that each agent belongs to at least one of the types. 
Partition axiom P2 postulates that each agent belongs to at most one of the 
types. Together they imply that agent types partition the set of agents. Partition 
axiom P3 expresses that every process knows its own type. 


4 Soundness and Completeness of EHL 


Since EHL is an extension of S5 with partition axioms governing the behavior 
of partition atoms while EHL models are instances of epistemic models, the 
soundness and completeness for EHL follows the standard proof for S5 (see, 
e.g., [9]), where additionally it is necessary to establish that the partition axioms 
are sound and that the canonical model satisfies the additional restrictions. 


Theorem 17 (Soundness). Logic EHL is sound with respect to EHL models, 
i.e., $\mathsf{EHL} \vdash \varphi$ implies $\models \varphi$. 


Proof. We only establish the validity of partition axioms. Axioms P1 and P2 
hold due to condition (3). Similarly, P3 holds because of (4). 


Completeness is proved by the standard canonical model construction, which 
requires several definitions. We omit the proofs of the following lemmas if com- 
pletely standard and only treat new cases otherwise. 


Definition 18 (Maximal consistent sets). A set $\Gamma \subseteq \mathcal{F}$ of formulas is called 
consistent iff $\mathsf{EHL} \nvdash \neg\bigwedge \Gamma_0$ for any finite subset $\Gamma_0 \subseteq \Gamma$. A set $\Gamma$ is called 
maximal consistent iff $\Gamma$ is consistent but no proper superset $\Delta \supsetneq \Gamma$ is consistent. 


Lemma 19 (Lindenbaum Lemma). Any consistent set $\Gamma$ can be extended 
to a maximal consistent set $\Delta \supseteq \Gamma$. 


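Definition 20 (Canonical model). The canonical model $M^C = (S^C, \sim^C, V^C)$ 
is defined as follows: 

- $S^C$ is the collection of all maximal consistent sets; 
- $\Gamma \sim^C_p \Delta$ iff $\{K_p\varphi \mid K_p\varphi \in \Gamma\} = \{K_p\varphi \mid K_p\varphi \in \Delta\}$; 
- $V^C(\Gamma) := \{r \in \mathit{Prop} \mid r \in \Gamma\}$. 

Lemma 21 (Truth Lemma). For any $\varphi \in \mathcal{F}$ and any $\Gamma \in S^C$, 

$$\varphi \in \Gamma \iff M^C, \Gamma \models \varphi.$$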


Lemma 22 (Correctness). The canonical model is an EHL model. 


Proof. That $S^C \neq \emptyset$ and $\sim^C_p$ is an equivalence relation for each $p \in \Pi$ is proved 
the same way as for S5. It remains to show that (3) and (4) hold. 

(3) Consider any maximal consistent set $\Gamma \in S^C$ and any agent $p \in \Pi$. By the 
    standard properties of maximal consistent sets, all theorems of EHL belong 
    to each maximal consistent set, in particular, $(\bigvee_{A \in \mathcal{A}} A_p) \in \Gamma$ because of 
    axiom P1. A disjunction belongs to a maximal consistent set iff one of the 
    disjuncts does. Hence, there exists at least one type $A$ such that $A_p \in \Gamma$. 
    At the same time, for any other type $E$, we have $(A_p \to \neg E_p) \in \Gamma$ because 
    of axiom P2. Hence, $E_p \notin \Gamma$ because maximal consistent sets are consistent 
    and closed with respect to (MP). It follows that there is exactly one partition 
    atom of the form $A_p$ in $\Gamma$. Hence, by the definition of $V^C$, 

    $$|V^C(\Gamma) \cap \{A_p \mid A \in \mathcal{A}\}| = 1.$$

(4) Consider two maximal consistent sets $\Gamma \sim^C_p \Delta$. Let $A_p \in \Gamma$. By P3, also 
    $K_p A_p \in \Gamma$. Hence, $K_p A_p \in \Delta$ by the definition of $\sim^C_p$. Finally, $A_p \in \Delta$ by 
    axiom t. We proved that $A_p \in \Gamma$ implies $A_p \in \Delta$. The inverse implication 
    is analogous. 


Theorem 23 (Completeness). Logic EHL is complete with respect to EHL 
models, i.e., $\mathsf{EHL} \vdash \varphi$ whenever $\models \varphi$. 


Proof. We prove the contrapositive. Assume $\mathsf{EHL} \nvdash \varphi$. That means that $\{\neg\varphi\}$ is 
consistent. By the Lindenbaum Lemma 19, there exists a maximal consistent set 
$\Gamma \supseteq \{\neg\varphi\}$. Hence, $\Gamma \in S^C$ for the canonical model $M^C$ defined in Definition 20, 
which is an EHL model by Lemma 22. By the Truth Lemma 21, it follows that 
$M^C, \Gamma \models \neg\varphi$. Since $M^C, \Gamma \not\models \varphi$ for some EHL model, $\varphi$ is not valid, i.e., $\not\models \varphi$. 


5 Properties of Creed 


In this section, we derive several useful properties of creed modalities. 

The explicit assumption P3 that each agent knows which type it belongs to 
implies a complete knowledge of own type, i.e., each agent a knows whether it 
belongs to any type A: 


Theorem 24. $\mathsf{EHL} \vdash \neg A_p \to K_p \neg A_p$ for all $p \in \Pi$, $A \in \mathcal{A}$, i.e., agents know 
which types they do not belong to. 


Proof. By P1, agent $p$ must belong to one of the types. Hence, if not type $A$, it 
must be one of the remaining types, i.e., $\neg A_p \to \bigvee_{E \neq A} E_p$. Therefore, we have 
$\neg A_p \to \bigvee_{E \neq A} K_p E_p$ due to P3. Given that $E_p \to \neg A_p$ for each $E \neq A$ by P2, 
also $K_p E_p \to K_p \neg A_p$ for each $E \neq A$ by standard modal reasoning. Hence, 
$\neg A_p \to K_p \neg A_p$. 


Corollary 25. $\mathsf{EHL} \vdash K_p A_p \vee K_p \neg A_p$ for all $p \in \Pi$, $A \in \mathcal{A}$. 

Proof. It follows directly from P3 and Theorem 24 by propositional reasoning. 


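The creed modality amounts to K45-belief: 

Theorem 26. Creed satisfies the normality, positive and negative introspection 
axioms if applied to statements already translated by an interpretation function. 
Formally, let $[\varphi]_{AE}$ stand for any formula $\xi$ such that $f_{AE}(\xi) = \varphi$. Then the 
following formulas are derivable in EHL: 

kc  $\vdash C_p^{A \backslash E}[\varphi \to \psi]_{AE} \to \bigl(C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}[\psi]_{AE}\bigr)$ 
4c  $\vdash C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}\bigl[C_p^{A \backslash E}[\varphi]_{AE}\bigr]_{AE}$ 
5c  $\vdash \neg C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}\bigl[\neg C_p^{A \backslash E}[\varphi]_{AE}\bigr]_{AE}$ 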


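Proof. We start by deriving kc: 

1. $C_p^{A \backslash E}[\varphi \to \psi]_{AE} = E_p \to K_p(\varphi \to \psi)$    definition of creed 
2. $K_p(\varphi \to \psi) \to (K_p\varphi \to K_p\psi)$    axiom k 
3. $C_p^{A \backslash E}[\varphi \to \psi]_{AE} \to (E_p \to (K_p\varphi \to K_p\psi))$    prop. reasoning from 1.,2. 
4. $C_p^{A \backslash E}[\varphi]_{AE} = E_p \to K_p\varphi$    definition of creed 
5. $C_p^{A \backslash E}[\varphi \to \psi]_{AE} \to (C_p^{A \backslash E}[\varphi]_{AE} \to (E_p \to K_p\psi))$    prop. reasoning from 3.,4. 
6. $E_p \to K_p\psi = C_p^{A \backslash E}[\psi]_{AE}$    definition of creed 
7. $C_p^{A \backslash E}[\varphi \to \psi]_{AE} \to (C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}[\psi]_{AE})$    rewriting of 5. using 6. 

The following is a derivation of 4c: 

1. $C_p^{A \backslash E}[\varphi]_{AE} = E_p \to K_p\varphi$    definition of creed 
2. $K_p\varphi \to K_p K_p\varphi$    axiom 4 
3. $C_p^{A \backslash E}[\varphi]_{AE} \to (E_p \to K_p K_p\varphi)$    prop. reasoning from 1.,2. 
4. $K_p\varphi \to (E_p \to K_p\varphi)$    prop. tautology 
5. $K_p K_p\varphi \to K_p(E_p \to K_p\varphi)$    normal modal reasoning from 4. 
6. $C_p^{A \backslash E}[\varphi]_{AE} \to (E_p \to K_p C_p^{A \backslash E}[\varphi]_{AE})$    prop. reasoning from 3.,5. using 1. 
7. $E_p \to K_p C_p^{A \backslash E}[\varphi]_{AE} = C_p^{A \backslash E}\bigl[C_p^{A \backslash E}[\varphi]_{AE}\bigr]_{AE}$    definition of creed 
8. $C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}\bigl[C_p^{A \backslash E}[\varphi]_{AE}\bigr]_{AE}$    rewriting of 6. using 7. 

The following is a derivation of 5c: 

1. $\neg C_p^{A \backslash E}[\varphi]_{AE} \leftrightarrow E_p \wedge \neg K_p\varphi$    prop. reasoning from the definition of creed 
2. $E_p \to K_p E_p$    axiom P3 
3. $\neg K_p\varphi \to K_p \neg K_p\varphi$    axiom 5 
4. $\neg C_p^{A \backslash E}[\varphi]_{AE} \to K_p(E_p \wedge \neg K_p\varphi)$    normal modal reasoning from 1.-3. 
5. $\neg C_p^{A \backslash E}[\varphi]_{AE} \to K_p \neg C_p^{A \backslash E}[\varphi]_{AE}$    normal modal reasoning from 1.,4. 
6. $\neg C_p^{A \backslash E}[\varphi]_{AE} \to (E_p \to K_p \neg C_p^{A \backslash E}[\varphi]_{AE})$    prop. reasoning from 5. 
7. $\neg C_p^{A \backslash E}[\varphi]_{AE} \to C_p^{A \backslash E}\bigl[\neg C_p^{A \backslash E}[\varphi]_{AE}\bigr]_{AE}$    rewriting of 6. 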


In addition, this creed belief is factive whenever the speaker type is correctly 
identified (cf. a similar conditional factivity for hope in [10]): 

Theorem 27. tc: $\mathsf{EHL} \vdash E_p \to (C_p^{A \backslash E}[\varphi]_{AE} \to \varphi)$. 


Proof. 
1. $\vdash C_p^{A \backslash E}[\varphi]_{AE} = (E_p \to K_p\varphi)$    definition of creed 
2. $\vdash K_p\varphi \to \varphi$    axiom t 
3. $\vdash E_p \to (C_p^{A \backslash E}[\varphi]_{AE} \to \varphi)$    prop. reasoning from 1.,2. 


On the other hand, misidentifying the speaker’s type may easily destroy 
factivity. Let $p \notin E$. Given that $C_p^{A \backslash E}[\varphi]_{AE} \to \varphi = (E_p \to K_p\varphi) \to \varphi$, we have 
$E_p \to K_p\varphi$ true simply because $E_p$ is false. Accordingly, there is no reason why 
$\varphi$ must hold. 

This provides a formal model of how a true statement can lead to false 
beliefs due to misinterpretation. Moreover, as Theorem 26 shows, such false 
beliefs cannot be detected by introspection. 
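
The loss of factivity can be made concrete with a one-world countermodel. The following worked example is only a sketch under stated assumptions: the single agent $p$, the two types $A$ and $E$, the propositional variable $x$, and the choice of $f_{AE}$ as the identity are all introduced here for illustration and do not come from the text.

```latex
% Assumptions (illustrative only): \Pi = \{p\}, \mathcal{A} = \{A, E\},
% Prop_p = \{x\}, and f_{AE} is the identity, so that [x]_{AE} = x.
\[
  W = \{w\}, \qquad {\sim_p} = \{(w, w)\}, \qquad V(w) = \{A_p\}.
\]
% Conditions (3) and (4) hold, so M = (W, \sim, V) is an EHL model.
\begin{align*}
  M, w &\not\models E_p
      && \text{since } E_p \notin V(w), \\
  M, w &\models E_p \to K_p x
      && \text{vacuously, i.e., } M, w \models C_p^{A \backslash E} x, \\
  M, w &\not\models x
      && \text{since } x \notin V(w), \\
  M, w &\not\models C_p^{A \backslash E} x \to x .
\end{align*}
```

Here $p$ is of type $A$ but is interpreted as if it were of type $E$, so the creed holds vacuously while the message content is false.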


6 Applications 


6.1 Formalizing “The Murders in the Rue Morgue” 


Example 1 describes a situation where honest witnesses provide contradictory 
information that is, nevertheless, successfully filtered by Dupin. We show how 
his reasoning can be formalized and explained using the creed modality. Dupin 
reads all witness accounts from a paper. We assume no misinterpretation of what 
the witnesses said. In addition, the paper mentions the exact type of each wit- 
ness (French not speaking Italian, Dutch not speaking French, etc.), which again 
is assumed to be factive. Hence, we use only one creed modality with the iden- 
tity interpretation function per witness account read by Dupin. In other words, 
Dupin reasons about the available information without the need to interpret it. 
The crucial question is: Why does Dupin ignore some but not all of the infor- 
mation provided by each witness? The answer becomes clear if we view each 
witness account as one or several creed modalities regarding what this witness 
heard from m. Ignoring slight variations in details, all witness statements can be 
divided into two types: (a) m did not speak the language I speak; (b) m spoke 
a language I do not speak. Dupin accepts statements (a) but ignores state- 
ments (b). Even when statement (b) of a witness contradicts statement (a) of 
another witness, Dupin accepts statements (a) from both witnesses. Here is how 
these statements of, say, the French witness $f \in F$ regarding the utterance $\varphi$ 
of $m$ can be represented via the creed modality: 


(a) $C_m^{F \backslash F}\varphi = F_m \to K_m f_{FF}(\varphi) = F_m \to K_m\bot$; 
(b) $C_m^{F \backslash I}\varphi = I_m \to K_m f_{FI}(\varphi) = I_m \to K_m\top$. 

Indeed, for (a), since the interpreting function from French to French is meaningful 
(in the simplest case, is the identity function), the fact that $f$ could not 
understand what $m$ was saying in this case means that $f_{FF}(\varphi) = \bot$. On the 
other hand, for (b), since $f$ does not know Italian, he has $f_{FI}(\psi) = \top$ for all $\psi$. 
As discussed in Example 13, (a) yields $\neg F_m$. Similarly, (b) yields $\top$ as per 
Example 12. This rightfully leads Dupin to the conclusion $\neg F_m$, i.e., $m \notin F$. In other 
words, statements (b) are ignored because they are trivial, not because they are 
false. One might say that, for $f$, a stronger precondition of $m$ saying something 
in Italian is $m \in I$. But using $I_m \to K_m I_m$ in place of (b) would yield axiom 
P3, still a logically trivial statement. 

In the story, $m$ was an orangutan (Ourang-Outang in Poe’s spelling), thus 
fulfilling $m \notin A$ for any language $A$ discussed. 


6.2 Solution to Knights and Knaves 


Clearly the partition of the island from Example 2 involves two types: $I$ for 
knIghts and $A$ for knAves. Let $s$ be the reasoner and $L$ be his type. The puzzle 
postulates that $f_{LI}(\varphi) = \varphi$ and $f_{LA}(\varphi) = \neg\varphi$ for any formula $\varphi$. Accordingly, 
the full information agent $s$ receives from agent $p$’s statement that $\varphi$ is 

$$C_p^{L}\varphi = C_p^{L \backslash I}\varphi \wedge C_p^{L \backslash A}\varphi = (I_p \to K_p\varphi) \wedge (A_p \to K_p\neg\varphi).$$

In the puzzle in question, $p$ states that at least one of $p$ and $q$ is a knave, $A_p \vee A_q$ 
in formulas. Hence, agent $s$ learns 

$$C_p^{L}(A_p \vee A_q) = (I_p \to K_p(A_p \vee A_q)) \wedge (A_p \to K_p\neg(A_p \vee A_q)). \qquad (7)$$

Here is how to derive in EHL that $p$ is a knight and $q$ is a knave, i.e., $I_p \wedge A_q$: 

1. $A_p \to K_p\neg(A_p \vee A_q)$    prop. reasoning from (7) 
2. $K_p\neg(A_p \vee A_q) \to \neg A_p$    t and prop. reasoning 
3. $\neg A_p$    prop. reasoning since $A_p \to \neg A_p$ follows from 1. and 2. 
4. $\neg A_p \to I_p$    P1 and prop. reasoning 
5. $I_p$    (MP) from 3. and 4. 
6. $I_p \to K_p(A_p \vee A_q)$    prop. reasoning from (7) 
7. $I_p \to A_p \vee A_q$    t and prop. reasoning from 6. 
8. $I_p \to A_q$    prop. reasoning from 7. since $I_p \to \neg A_p$ by P2 
9. $I_p \wedge A_q$    prop. reasoning from 5. and 8. 

Hence, $\mathsf{EHL} \vdash C_p^{L}(A_p \vee A_q) \to I_p \wedge A_q$. 
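
The outcome of this derivation can be cross-checked by brute force. The sketch below is an illustration rather than a replacement for the proof: it works only with the factive consequences of (7) obtained through axiom t, namely $I_p \to (A_p \vee A_q)$ and $A_p \to \neg(A_p \vee A_q)$, and it encodes P1 and P2 by giving each agent exactly one of the two types. The function name `consistent_assignments` is hypothetical.

```python
from itertools import product

def consistent_assignments():
    """Enumerate type assignments for p and q compatible with p's statement."""
    solutions = []
    for p_knight, q_knight in product([True, False], repeat=2):
        # P1 and P2: each agent is a knight or a knave, never both.
        a_p, a_q = not p_knight, not q_knight
        statement = a_p or a_q  # "at least one of p and q is a knave"
        # Factive consequences of (7), using axiom t to strip K_p:
        knight_case = (not p_knight) or statement   # I_p -> (A_p or A_q)
        knave_case = (not a_p) or (not statement)   # A_p -> not (A_p or A_q)
        if knight_case and knave_case:
            solutions.append(("knight" if p_knight else "knave",
                              "knight" if q_knight else "knave"))
    return solutions

print(consistent_assignments())
```

The only surviving assignment makes $p$ a knight and $q$ a knave, in agreement with step 9 of the derivation.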


6.3 Modelling of Software Updates 


Consider a heterogeneous distributed system with two agent types, $U$ for the 
updated agents running the most recent software and $O$ for the agents running 
the old protocol, which is designed with the possibility of future updates in 
mind. Since the new protocols are designed by taking into account the existence 
of processes running the old protocol, the interpretation functions can be built 
asymmetrically. Each type interprets information from its own type directly: 
$f_{UU}(\varphi) = \varphi$ and $f_{OO}(\varphi) = \varphi$. $U$ agents can interpret messages from $O$ agents 
using backward compatibility $f_{UO}(\varphi) = g(f_{OO}(\varphi))$, where $g$ translates into the 
updated system language. 

The opposite is not always possible as $O$ agents have no knowledge of the 
new protocols. Accordingly, messages $\varphi$ compatible with the old protocol will be 
processed as before, i.e., using $f_{OO}(\varphi)$. But if $\varphi$ is unknown to the old protocol, 
i.e., $f_{OO}(\varphi) = \bot$, the creed under the assumption that sender $s \in O$ would yield 
$C_s^{O \backslash O}\varphi \leftrightarrow \neg O_s$. In this case, receiver $r$ can conclude that the sender process $s$ 
does not conform to the old protocol. Since this error flagging disappears when $r$ 
is also updated, however, it may very well be the case that this does not violate 
the fault resilience properties of the old protocol, in particular, when not too 
many processes are updated simultaneously. In this case, $r$ could be guaranteed 
to always compute a correct result. 
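
The asymmetry between the interpretation functions can be sketched in code. Everything concrete below is an assumption made for illustration: messages are plain strings, the old vocabulary and the translation table standing in for $g$ are invented, and `BOTTOM` stands for the unsatisfiable precondition $\bot$.

```python
BOTTOM = "BOTTOM"  # stands for the unsatisfiable precondition

# Hypothetical old-protocol vocabulary and translation table playing the role of g.
OLD_VOCABULARY = {"ping", "ack"}
TRANSLATION = {"ping": "ping_v2", "ack": "ack_v2"}

def f_OO(message):
    """Old agents read old messages directly; anything else is impossible for them."""
    return message if message in OLD_VOCABULARY else BOTTOM

def g(formula):
    """Translate an old-protocol formula into the updated language."""
    return BOTTOM if formula == BOTTOM else TRANSLATION[formula]

def f_UO(message):
    """Backward compatibility: updated agents interpret old senders via g after f_OO."""
    return g(f_OO(message))

def f_UU(message):
    """Updated agents read messages of updated agents directly."""
    return message

def rules_out_old_sender(message):
    """An O receiver can conclude that the sender is not of type O exactly when
    f_OO yields bottom for the message."""
    return f_OO(message) == BOTTOM
```

For instance, under these assumptions `f_UO("ping")` yields `"ping_v2"`, whereas a message outside the old vocabulary makes `rules_out_old_sender` return `True`, which corresponds to the error flagging described above.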


6.4 Comparison to Related Work 


The interpretation functions in the knights and knaves puzzles depend on the 
speaker only, which made it possible to formalize them in [20] by means of public 
announcements. In the other two examples (Rue Morgue and software update), 
there is an additional difficulty: even knowing the sender’s type, agents interpret 
messages differently based on the varying levels of knowledge about the sender’s 
protocol. This important degree of freedom of our method compared to [20] is 
especially central to the software update example. 


7 Conclusion and Future Work 


This paper provides a sound and complete axiomatization for a logic for heteroge- 
neous distributed systems that generalizes the logic of fault-tolerant distributed 
systems and enables us to explicitly model the interpretation of messages sent by 
agents that execute different protocols (identified by types). It revolves around 
a (derived) new modality called creed, a generalization of the hope modality for 
byzantine agents, that satisfies positive and negative introspection post message- 
interpretation and enjoys factivity when the sender’s type is correctly identified. 
We demonstrated the explanatory power of our approach by applying it to three 
representative examples from areas ranging from detective reasoning to logic 
puzzles to distributed systems. The current formalization assumes that agents’ 
knowledge is factive even if this factivity does not affect how they communicate. 
Relaxing this assumption and working with agents whose beliefs may be com- 
promised, e.g., due to sensor errors or memory failures, is a natural next step. 
Another natural extension is to allow for on-the-fly updates to the interpretation 
functions based on received information. 


Abstract. Clause sets saturated by hierarchic ordered resolution do 
not offer a model representation that can be effectively queried, in 
general. They only offer the guarantee of the existence of a model. 
We present an effective symbolic model construction for saturated con- 
strained Horn clauses. Constraints are in linear arithmetic, the first-order 
part is restricted to a function-free language. The model is constructed 
in finite time, and non-ground clauses can be effectively evaluated with 
respect to the model. Furthermore, we prove that our model construction 
produces the least model. 


Keywords: Bernays-Schönfinkel Fragment · Linear Arithmetic · Horn 
Clauses · Superposition · Model Construction 


1 Introduction 


Constrained Horn Clauses (CHCs) combine logical formulas with constraints 
over various domains, e.g. linear real arithmetic, linear integer arithmetic, equali- 
ties of uninterpreted functions [15]. This formalism has gained widespread atten- 
tion in recent years due to its applications in a variety of fields, including program 
analysis and verification: safety, liveness, and termination [17,38], complexity 
and resource analysis [33], intermediate representation [22], and software testing [35]. 
Technical controls, so-called Supervisors, like an electronic engine control 
unit or a lane change assistant in a car [8,9], can be modelled, run, and proven 
safe. Moreover, there exist many different approaches for reasoning in CHCs and 
associated first-order logic fragments extended with theories [2,5,7,10,15,23-25,28,29,34,37]. 
Thus, CHCs are a powerful tool for reasoning about complex 
systems that involve logical constraints, and they have been used to solve a wide 
range of problems. 

A failed proof attempt of some conjecture or undesired run points to a bug. 
In this case investigation of the cause of the unexpected result or behavior is 
crucial. Building a model of the situation that can then be effectively queried 
is an important means towards a repair. However, some algorithms for CHCs, 
e.g. hierarchic superposition, which boils down to hierarchic ordered resolution 
in the context of CHCs, do not return a model that can be effectively queried 
if a proof attempt fails, in general. If so, queries are still restricted to ground 
clauses [4]. 

The contribution of our paper can be seen as an extension for these saturation 
based algorithms that produces models and not just saturated clause sets. In 
fact, we show how to build symbolic models out of any saturated CHC clause 
set over linear arithmetic. This fragment is equivalent to Horn clause sets of 
linear arithmetic combined with the Bernays-Schönfinkel fragment. Recall that 
although satisfiability in this fragment is undecidable [16,26], in general, for a 
finitely saturated set we can construct such a representation in finite time. 

Our models fulfill all important properties postulated in the literature for 
automated model building in first-order logic [13,20]. First, they can be effec- 
tively constructed, i.e., each model is represented by one linear arithmetic for- 
mula of finite size for each of its predicates and it can be constructed in finite 
time. Second, they are unique, i.e., the model representation specifies exactly 
one interpretation; in our case the least model. Third, they can be effectively 
queried, i.e., we provide decision procedures that evaluate whether an atom, 
clause, or formula is entailed/satisfied by the model. Fourth, it is possible to test 
the equivalence of two models. The approach we present does not exploit features 
of linear arithmetic beyond equality, the existence of a well-founded order 
for the theory’s universe, and decidability of the theory. The results may therefore 
be adapted to other constraint domains. Model representations that can 
be effectively constructed and queried like ours are also called effective model 
representations. Moreover, our method is the first effective model construction 
approach for ordered resolution (or its extension to superposition) that is based 
on saturation, goes beyond ground clauses, and includes theory constraints. In 
the future, we plan to use this approach as the basis for a more general model 
construction approach that also works on more expressive fragments of first-order 
logic modulo theories. 

Our model construction is inspired by the model construction operator used 
in the proof for refutational completeness of hierarchic superposition [3,6,30]. 
The main difference is that the model construction operator from the refuta- 
tional completeness proof is restricted to ground clauses and executed on the 
potentially infinite ground instances of the saturated clause set (in addition 
to an infinite axiomatization of the background theory as ground clauses). As a 
result, the model construction operator from the refutational completeness proof 
cannot effectively construct a model because iterating over a potentially infinite 
set means it may diverge. Moreover, in contrast to our model construction, the 
original model operator cannot effectively evaluate non-ground atoms, clauses, 
or formulas. It is, however, sufficient to show the existence of a model if the 
clause set is saturated and does not contain the empty clause [3,6,30]. In our 
version of the model construction operator, we managed to lift the restriction to 
ground clause sets by restricting the input logic to the Horn Bernays-Schönfinkel 
fragment instead of full first-order logic. This enables us to define a strict 
propagation/production order for our non-ground clauses instead of just for ground 
clauses. As a result, we can construct the model one clause at a time. 

The paper is organized as follows. In Sect. 2 we clarify notation and 
preliminaries. The main contribution is presented in Sect. 3. At the end of 
this section, we also explain how our models satisfy the postulates (see [13, 
Section 5.1, p. 234]) by Fermüller and Leitsch for automated model building. We 
conclude in Sect. 4. Proofs were elided in favor of explanations and examples. 
An extended version, which includes proofs, can be found at [12]. 


2 Preliminaries and Notation 


We briefly recall the basic logical formalisms and notations we build upon [9]. 
Our starting point is a standard first-order language with variables (denoted 
x,y,z), predicates (denoted P,Q) of some fixed arity, and terms (denoted t, s). 
An atom (denoted $A$) is an expression $P(t_1, \dots, t_n)$ for a predicate $P$ of arity 
$n = \mathrm{arity}(P)$. When the terms $t_1, \dots, t_n$ in $P(t_1, \dots, t_n)$ are not relevant in some 
context, we also write $P(*)$. A positive literal is an atom $A$ and a negative literal 
is a negated atom $\neg A$. We define $\mathrm{comp}(A) = \neg A$, $\mathrm{comp}(\neg A) = A$, $|A| = A$, 
and $|\neg A| = A$. Literals are usually denoted $L, K$. We sometimes write literals as 
$[\neg]P(*)$, meaning that the sign of the literal is arbitrary, often followed by a case 
distinction. Formulas are defined in the usual way using quantifiers $\forall$, $\exists$ and the 
boolean connectives (in order of decreasing binding strength) $\neg$, $\vee$, $\wedge$, $\to$, and 
$\leftrightarrow$. The logic we consider does not feature a first-order equality predicate. 

A clause (denoted $C, D$) is a universally closed disjunction of literals 
$A_1 \vee \dots \vee A_n \vee \neg B_1 \vee \dots \vee \neg B_m$. We may equivalently write 
$B_1 \wedge \dots \wedge B_m \to A_1 \vee \dots \vee A_n$. 
A clause is Horn if it contains at most one positive literal, i.e. $n \le 1$. In Sect. 3, 
all clauses considered are Horn clauses. If $Y$ is a term, formula, or a set thereof, 
$\mathrm{vars}(Y)$ denotes the set of all variables in $Y$, and $Y$ is ground if $\mathrm{vars}(Y) = \emptyset$. 
Analogously, $\Pi(Y)$ is the set of predicate symbols occurring in $Y$. 

The Bernays-Schönfinkel Clause Fragment (BS) in first-order logic consists 
of first-order clauses where all terms are either variables or constants. The 
Horn Bernays-Schönfinkel Clause Fragment (HBS) is further restricted to Horn 
clauses. 

A substitution $\sigma$ is a function from variables to terms with a finite domain 
and codomain. We denote substitutions by $\sigma, \tau$. The application of substitutions 
is often written postfix, as in $x\sigma$, and is homomorphically extended to terms, 
atoms, literals, clauses, and quantifier-free formulas. A substitution is ground if 
its codomain is ground. Let $Y$ denote some term, literal, clause, or clause set. A 
substitution $\sigma$ is a grounding for $Y$ if $Y\sigma$ is ground, and $Y\sigma$ is a ground instance 
of $Y$ in this case. We denote by $\mathrm{gnd}(Y)$ the set of all ground instances of $Y$. 
The most general unifier $\mathrm{mgu}(Z_1, Z_2)$ of two terms/atoms/literals $Z_1$ and $Z_2$ is 
defined as usual, and we assume that it does not introduce fresh variables and 
is idempotent. 


2.1 Horn Bernays-Schönfinkel with Linear Arithmetic 


The class HBS(LRA) is the extension of the Horn Bernays-Schönfinkel fragment 
with linear real arithmetic (LRA). Analogously, the classes HBS(LQA) and 
HBS(LIA) are the extensions of the Horn Bernays-Schönfinkel fragment with 
linear rational arithmetic (LQA) and linear integer arithmetic (LIA), respectively. 
The only difference between the three classes is the sort LA their variables and 
terms range over and the universe $\mathcal{U}$ over which their interpretations range. As 
the names already imply, LA = LRA and $\mathcal{U} = \mathbb{R}$ for HBS(LRA), LA = LQA and 
$\mathcal{U} = \mathbb{Q}$ for HBS(LQA), and LA = LIA and $\mathcal{U} = \mathbb{Z}$ for HBS(LIA). The results 
presented in this paper hold for all three classes, and by HBS(LA) we denote 
that we are talking about an arbitrary one of them. 

Linear arithmetic terms are constructed from a set $\mathcal{X}$ of variables, the set of 
constants $c \in \mathbb{Q}$ (if in HBS(LRA) or HBS(LQA)) or $c \in \mathbb{Z}$ (if in HBS(LIA)), 
and binary function symbols $+$ and $-$ (written infix). Additionally, we allow 
multiplication $\cdot$ if one of the factors is a constant. Multiplication only serves us 
as syntactic sugar to abbreviate other arithmetic terms, e.g., $x + x + x$ is 
abbreviated to $3 \cdot x$. Atoms in HBS(LA) are either first-order atoms (e.g., $P(13, x)$) 
or (linear) arithmetic atoms (e.g., $x < 42$). Arithmetic atoms are denoted by $\lambda$ 
and may use the predicates $\le, <, \not\approx, \approx, >, \ge$, which are written infix and have 
the expected fixed interpretation. We use $\approx$ instead of $=$ to avoid confusion 
between equality in LA and equality on the meta level. While we do not permit 
quantifiers in the syntax of clauses, the notion of symbolic interpretations that 
we will develop does require this, denoted as usual. By atoms(Y)/quants(Y) we 
denote the linear arithmetic atoms/quantifiers in a formula or set of formulas $Y$. 
First-order literals and related notation are defined as before. Arithmetic literals 
coincide with arithmetic atoms, since the arithmetic predicates are closed under 
negation, e.g., $\neg(x \ge 42)$ is equivalent to $x < 42$. 
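
The closure of the arithmetic predicates under negation can be illustrated with a small table. The following is a minimal sketch under simplifying assumptions that are not from the text: atoms are triples `(lhs, predicate, rhs)` whose sides are a variable name or a number (no compound terms), predicates use ASCII spellings (`==` for $\approx$, `!=` for $\not\approx$), and the names `negate` and `evaluate` are hypothetical.

```python
import operator
from fractions import Fraction

# Fixed interpretation of the six predicates (ASCII spellings are an assumption).
PREDICATES = {
    "<=": operator.le, "<": operator.lt, "!=": operator.ne,
    "==": operator.eq, ">": operator.gt, ">=": operator.ge,
}

# Each predicate has a complementary predicate, so negating an atom gives an atom.
NEGATION = {"<=": ">", "<": ">=", "!=": "==", "==": "!=", ">": "<=", ">=": "<"}

def negate(atom):
    """Return the arithmetic atom equivalent to the negation of `atom`."""
    lhs, pred, rhs = atom
    return (lhs, NEGATION[pred], rhs)

def evaluate(atom, assignment):
    """Evaluate an atom whose sides are variable names or numeric constants."""
    lhs, pred, rhs = atom

    def value(term):
        return assignment[term] if isinstance(term, str) else term

    return PREDICATES[pred](value(lhs), value(rhs))

# The example from the text: the negation of x >= 42 is x < 42.
atom = ("x", ">=", 42)
assignment = {"x": Fraction(83, 2)}  # 41.5, a rational value for x
```

Under any assignment, `evaluate(negate(atom), assignment)` is the opposite of `evaluate(atom, assignment)`, which is why no separate notion of negative arithmetic literal is needed.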

HBS(LA) clauses are defined as for HBS but using HBS(LA) atoms. We often
write clauses in the form $\Lambda \parallel C$ where $C$ is a clause solely built of free first-order
literals and $\Lambda$ is a multiset of LA atoms called the constraint of the clause. A
clause of the form $\Lambda \parallel C$ is therefore also called a constrained clause. Since the
interpretation of linear arithmetic relations is fixed, we set $\Pi(\Lambda \parallel C) := \Pi(C)$.
The fragment we consider in Sect. 3 is restricted even further to abstracted
clauses: For any clause $\Lambda \parallel C$, all terms in $C$ must be variables. Put differently,
we disallow any arithmetic function symbols, including numerical constants, in
$C$. Variable abstraction, e.g. rewriting $x > 3 \parallel P(x, 1)$ to $x > 3, y \approx 1 \parallel P(x, y)$, is
always possible. Hence, the restriction to abstracted clauses is not a theoretical
limitation, but allows us to formulate our model construction operator in a more
concise way. We assume abstracted clauses for theory development, but we prefer
non-abstracted clauses in examples for readability, e.g., a unit clause $P(3, 5)$ is
considered in the development of the theory as the clause $x \approx 3, y \approx 5 \parallel P(x, y)$.
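
Variable abstraction is mechanical. The following is a minimal Python sketch, not part of the calculus itself: the clause encoding (constraint atoms as triples, literals as tuples) and the helper name `abstract_clause` are assumptions made for illustration, and the generated names `_v0`, `_v1`, ... are assumed not to occur in the input clause.

```python
from itertools import count


def abstract_clause(constraint, literals):
    """Replace numeric constants in first-order literals by fresh variables.

    constraint: list of (lhs, op, rhs) triples, e.g. ("x", ">", 3)
    literals:   list of (positive, predicate, args), where every argument
                is either a variable name (str) or a number
    Returns the abstracted (constraint, literals) pair.
    """
    fresh = (f"_v{i}" for i in count())  # hypothetical fresh-name scheme
    new_constraint = list(constraint)
    new_literals = []
    for positive, pred, args in literals:
        new_args = []
        for arg in args:
            if isinstance(arg, str):
                new_args.append(arg)  # already a variable
            else:
                var = next(fresh)
                # move the constant into the constraint: var ≈ constant
                new_constraint.append((var, "≈", arg))
                new_args.append(var)
        new_literals.append((positive, pred, tuple(new_args)))
    return new_constraint, new_literals


# x > 3 || P(x, 1)  becomes  x > 3, _v0 ≈ 1 || P(x, _v0)
abstracted = abstract_clause([("x", ">", 3)], [(True, "P", ("x", 1))])
```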

In contrast to other works, e.g. [11], we do not permit first-order constants, 
and consequently also no variables that range over the induced Herbrand uni- 
verse. All variables are arithmetic in the sense that they are interpreted by U. 
Since we only allow equalities in the arithmetic constraint, it is possible to sim- 
ulate variables over first-order constants, by e.g. numbering them, i.e. defining 
a bijection between N and constant symbols. So this is again not a theoretical
limitation. 
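The semantics of $\Lambda \parallel C$ is as follows:

\[ \Lambda \parallel C \quad\text{iff}\quad \Big(\bigwedge_{\lambda \in \Lambda} \lambda\Big) \to C \quad\text{iff}\quad \Big(\bigvee_{\lambda \in \Lambda} \neg\lambda\Big) \lor C \]

For example, the clause $x > 1 \lor y \not\approx 5 \lor \neg Q(x) \lor R(x, y)$ is also written $x \leq 1, y \approx 5 \parallel \neg Q(x) \lor R(x, y)$.
The negation $\neg(\Lambda \parallel C)$ of a constrained clause $\Lambda \parallel C$ where
$C = A_1 \lor \dots \lor A_n \lor \neg B_1 \lor \dots \lor \neg B_m$ is thus equivalent to
$(\bigwedge_{\lambda \in \Lambda} \lambda) \land \neg A_1 \land \dots \land \neg A_n \land B_1 \land \dots \land B_m$.
Note that since the neutral element of conjunction
is $\top$, an empty constraint is thus valid, i.e. equivalent to true. In analogy to the
empty clause in settings without constraints, we write $\square$ to mean any and all
clauses $\Lambda \parallel \bot$ where $\Lambda$ is satisfiable, which are all unsatisfiable.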

An assignment for a constraint $\Lambda$ is a substitution (denoted $\beta$) that maps all
variables in vars($\Lambda$) to values in $\mathcal{U}$. An assignment is a solution for a constraint
$\Lambda$ if all atoms $\lambda \in (\Lambda\beta)$ evaluate to true. A constraint $\Lambda$ is satisfiable if there
exists a solution for $\Lambda$. Otherwise it is unsatisfiable.
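
Checking whether a given assignment is a solution only requires evaluating each atom. The sketch below illustrates this for rational values; the term encoding (a dictionary from variable names to coefficients, with the key `None` for the constant part) and the names `evaluate` and `is_solution` are assumptions chosen for this illustration.

```python
import operator
from fractions import Fraction

# Fixed interpretation of the arithmetic predicates (ASCII spellings).
OPS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}


def evaluate(term, beta):
    """Evaluate a linear term under the assignment beta.

    term maps variable names to coefficients; the key None holds the constant.
    """
    return sum(
        coeff * (beta[var] if var is not None else 1)
        for var, coeff in term.items()
    )


def is_solution(constraint, beta):
    """True iff every atom (lhs, op, rhs) of the constraint holds under beta."""
    return all(
        OPS[op](evaluate(lhs, beta), evaluate(rhs, beta))
        for lhs, op, rhs in constraint
    )


# Constraint y >= 5, x2 = x + 1 (x2 plays the role of a primed variable).
constraint = [
    ({"y": 1}, ">=", {None: 5}),
    ({"x2": 1}, "=", {"x": 1, None: 1}),
]
good = is_solution(constraint, {"x": Fraction(1, 2), "y": 5, "x2": Fraction(3, 2)})
bad = is_solution(constraint, {"x": Fraction(1, 2), "y": 4, "x2": Fraction(3, 2)})
```

Finding a solution, as opposed to checking one, requires an LA decision procedure and is not covered by this sketch.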

We assume pure input clause sets because otherwise satisfiability is unde- 
cidable for impure HBS(LA) [21]. This means the only constants of our sort 
LA are concrete rational numbers. Irrational numbers are not allowed by the 
standard definition of the theory. Fractions are not allowed if LA = LIA. Sat- 
isfiability of pure HBS(LA) clause sets is semi-decidable, e.g., using hierarchic 
superposition [3] or SCL(T) [10]. Note that pure HBS(LA) clauses correspond 
to constrained Horn clauses (CHCs) with LA as background theory. 

All arithmetic predicates and functions are interpreted in the usual way
denoted by the interpretation $\mathcal{A}^{LA}$. An interpretation of HBS(LA) coincides with
$\mathcal{A}^{LA}$ on arithmetic predicates and functions, and freely interprets non-arithmetic
predicates. For pure clause sets this is well-defined [3]. Logical satisfaction and
entailment are defined as usual, and use similar notation as for HBS.


Example 1. The clause $y \geq 5, x' \approx x + 1 \parallel S_0(x, y) \to S_1(x', 0)$ is part of a
timed automaton with two clocks $x$ and $y$ modeled in HBS(LA). It represents a
transition from state $S_0$ to state $S_1$ that can be traversed only if clock $y$ is at
least 5 and that resets $y$ to 0 and increases $x$ by 1.


2.2 Ordering Literals and Clauses 


In order to define redundancy for constrained clauses, we need an order: Let $\prec_\Pi$
be a total, well-founded, strict ordering on predicate symbols and let $\prec_{\mathcal{U}}$ be a
total, well-founded, strict ordering on the universe $\mathcal{U}$. (Note that $\prec_{\mathcal{U}}$ cannot be
the standard ordering $<$ because it is not well-founded for $\mathbb{Z}$, $\mathbb{Q}$, or $\mathbb{R}$. In the case
of $\mathbb{R}$, the existence of such an order even depends on whether we assume
the axiom of choice [18].) We extend these orders step by step. First, to atoms,
i.e., $P(\vec{a}) \prec Q(\vec{b})$ if $P \prec_\Pi Q$ or $P = Q$, $\vec{a}, \vec{b} \in \mathcal{U}^{|\vec{a}|}$, and $\vec{a} \prec_{lex} \vec{b}$, where $\prec_{lex}$ is
the lexicographic extension of $\prec_{\mathcal{U}}$. Next, we extend the order to literals with a
strict precedence on the predicate and the polarity, i.e.,

\[ P(\vec{t}) \prec \neg P(\vec{s}) \prec Q(\vec{u}) \quad\text{if } P \prec Q \]

independent of the arguments of the literals. Then, take the multiset extension
to order clauses. To handle constrained clauses, extend the relation such that
constraint literals (in our case arithmetic literals) are always smaller than first-order
literals. We conflate the notation of all extensions into the symbol $\prec$ and
define $\preceq$ as the reflexive closure of $\prec$. Note that $\prec$ is only total for ground
atoms/literals/clauses, which is sufficient for a hierarchic superposition order [6].
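
To make the requirements on the order concrete, here is a small Python sketch for the integer case only. It is an illustration under assumptions, not the order used in any proof: `z_key` realizes one possible total well-founded order on the integers (0, 1, -1, 2, -2, ...), and `literal_key` sorts ground literals by predicate rank first, polarity second, and uses the arguments only to break ties. The `rank` dictionary encoding the predicate precedence is hypothetical.

```python
def z_key(z):
    """Sort key for a well-founded total order on the integers.

    Integers are compared by absolute value first and sign second, which
    yields 0, 1, -1, 2, -2, ...; every integer has finitely many predecessors.
    """
    return (abs(z), z < 0)


def literal_key(literal, rank):
    """Sort key for ground literals (positive, predicate, args).

    Predicate rank dominates, then polarity (positive before negative);
    the integer arguments are only used to break remaining ties.
    """
    positive, pred, args = literal
    return (rank[pred], not positive, tuple(z_key(a) for a in args))


rank = {"P": 0, "Q": 1}  # hypothetical precedence P before Q
literals = [(True, "Q", (0,)), (False, "P", (5,)), (True, "P", (7,))]
ordered = sorted(literals, key=lambda lit: literal_key(lit, rank))
```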


Definition 2 ($\prec$-maximal Literal). A literal $L$ is called $\prec$-maximal in a
clause $C$ if there exists a grounding substitution $\sigma$ for $C$, such that there is no
different $L' \in C$ for which $L\sigma \prec L'\sigma$. The literal $L$ is called strictly $\prec$-maximal
if there is no different $L' \in C$ for which $L\sigma \preceq L'\sigma$.


Proposition 3. If $\prec$ is a predicate-based ordering, $C$ is a Horn clause, $C$ has a
positive literal $L$, and $L$ is $\prec$-maximal in $C$, then $L$ is strictly $\prec$-maximal in $C$.


Definition 4 ($\prec$-maximal Predicate in Clause). A predicate symbol $P$ is
called (strictly) $\prec$-maximal in a clause $C$ if there is a literal $[\neg]P(*) \in C$ that
is (strictly) $\prec$-maximal in $C$.


Definition 5. Let $N$ be a set of clauses, $\prec$ a clause ordering, $C$ a clause, and
$P$ a predicate symbol. Then $N^{\preceq C} := \{C' \in N \mid C' \preceq C\}$ and $N^{\preceq P} := \{C' \in N \mid
Q \text{ is } \prec\text{-maximal in } C' \text{ and } Q \preceq P\}$.


2.3 Hierarchic Superposition, Redundancy and Saturation 


For pure HBS(LA) most rules of the (hierarchic) superposition calculus become 
obsolete or can be simplified. In fact, in the HBS(LA) case (hierarchic) super- 
position boils down to (hierarchic) ordered resolution. For a full definition of 
(hierarchic) superposition calculus in the context of linear arithmetic, consider 
SUP(LA) [1]. Here, we will only define its simplified version in the form of the 
hierarchic resolution rule. 


Definition 6 (Hierarchic $\prec$-Resolution). Let $\prec$ be an order on literals and
$\Lambda_1 \parallel L_1 \lor C_1$, $\Lambda_2 \parallel L_2 \lor C_2$ be constrained clauses. The inference rule of hierarchic
$\prec$-resolution is:

\[ \frac{\Lambda_1 \parallel L_1 \lor C_1 \qquad \Lambda_2 \parallel L_2 \lor C_2 \qquad \sigma = \mathrm{mgu}(L_1, \mathrm{comp}(L_2))}{(\Lambda_1, \Lambda_2 \parallel C_1 \lor C_2)\sigma} \]

where $L_1$ is $\prec$-maximal in $C_1$ and $L_2$ is $\prec$-maximal in $C_2$.


Note that in the resolution rule we do not enforce explicitly that the positive
literal is strictly maximal. This is possible because in the Horn case any positive
literal is strictly maximal if it is maximal in the clause.

For saturation, we need a termination condition that defines when the calcu- 
lus under consideration cannot make any further progress. In the case of super- 
position, this notion is that any new inferences are redundant. 
Definition 7 (Clause Redundancy). A ground clause $\Lambda \parallel C \in N$ is redundant
with respect to a set $N$ of ground clauses and order $\prec$ if $N^{\prec \Lambda \parallel C} \models \Lambda \parallel C$.
A potentially non-ground clause $\Lambda \parallel C \in N$ is redundant with respect to a potentially
non-ground clause set $N$ and order $\prec$ if for all $\Lambda' \parallel C' \in \mathrm{gnd}(\Lambda \parallel C)$ the
clause $\Lambda' \parallel C'$ is redundant with respect to $\mathrm{gnd}(N)$.


If a clause $\Lambda \parallel C \in N$ is redundant with respect to a clause set $N$, then it can
be removed from $N$ without changing its semantics. If $\Lambda \parallel C$ is newly inferred,
then we also call it redundant if $\Lambda \parallel C$ is already part of $N$. The same cannot be
said for clauses in $N$ or all clauses in $N$ would be redundant. Determining clause
redundancy is an undecidable problem [10,40]. However, there are special cases
of redundant clauses that can be easily checked, e.g., tautologies and subsumed
clauses. Redundancy also means that $\mathcal{I} \models N^{\prec \Lambda \parallel C}$ implies $\mathcal{I} \models \Lambda \parallel C$ if $\Lambda \parallel C$ is
redundant w.r.t. $N$. We will exploit this fact in the model construction.


Definition 8 (Saturation). A set of clauses N is saturated up to redundancy 
with respect to some set of inference rules, if application of any rules to clauses 
in N yields a clause that is redundant with respect to N or is contained in N. 


2.4 Interpretations 


In our context, models are interpretations that satisfy (sets of) clauses. The 
standard notion of an interpretation is fairly opaque and interprets a predicate 
P as the potentially infinite set of ground arguments that satisfy P. 


Definition 9 (Interpretation). Let $P$ be a predicate symbol with arity($P$) =
$n$. Then, $P^{\mathcal{I}}$ denotes the subset of $\mathcal{U}^n$ for which the interpretation $\mathcal{I}$ maps the
predicate symbol $P$ to true.


Since our model construction approach manipulates interpretations directly, 
we need a notion of interpretations that always has a finite representation and 
for which it is possible to decide (in finite time) whether a clause is satisfied by 
the interpretation. Therefore, we rely on the notion of symbolic interpretations: 


Definition 10 (Symbolic Interpretation). Let $x_1, x_2, \dots$ be an infinite
sequence of distinct variables, i.e. $x_i \neq x_j$ for all $1 \leq i < j$. (We assume the same
sequence for all symbolic interpretations in order to prevent conflicts when we
later combine multiple symbolic interpretations into one.) A symbolic interpretation
$S$ is a function that maps every predicate symbol $P$ with arity($P$) = $n$ to
a formula denoted $P^S(\vec{x})$ of finite size, constructed using the usual boolean connectives
over LA atoms, where the only free variables appear in $\vec{x} = (x_1, \dots, x_n)$.
The interpretation $\mathcal{I}_S$ corresponding to $S$ is defined by $P^{\mathcal{I}_S} = \{(\vec{x})\beta \mid \beta \models P^S(\vec{x})\}$
and maps the predicate symbol $P$ to true for the subset of $\mathcal{U}^n$ which corresponds
to the solutions of $P^S(\vec{x})$.


Example 11. Let $N$ be a clause set consisting of the clauses $0 \leq x \leq 2, 0 \leq y \leq 2 \parallel P(x, y)$
and $x_Q \geq x_P + 1, y_Q \geq y_P + 1 \parallel \neg P(x_P, y_P) \lor Q(x_Q, y_Q)$. An example
of a symbolic interpretation $S$ that satisfies $N$ would be the function that maps
$P$ to $P^S(x_1, x_2) = 0 \leq x_1 \leq 2 \land 0 \leq x_2 \leq 2$ and $Q$ to $Q^S(x_1, x_2) = 1 \leq x_1 \land 1 \leq x_2$.
It corresponds to the interpretation $\mathcal{I}_S$ where $P^{\mathcal{I}_S} = \{(a_1, a_2) \in \mathcal{U}^2 \mid 0 \leq a_1 \leq 2 \land 0 \leq a_2 \leq 2\}$
and $Q^{\mathcal{I}_S} = \{(a_1, a_2) \in \mathcal{U}^2 \mid 1 \leq a_1 \land 1 \leq a_2\}$.
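
A symbolic interpretation can be pictured as one membership test per predicate. The Python sketch below encodes the interpretation of Example 11, assuming the bounds there are non-strict, and checks the second clause on a grid of rational sample points. The dictionary `S` and the helper `clause2_holds` are names invented for this illustration, and sampling is only a sanity check; an actual decision requires the clause evaluation function together with an LA decision procedure.

```python
from fractions import Fraction

# One membership test per predicate, mirroring P^S and Q^S of Example 11.
S = {
    "P": lambda x1, x2: 0 <= x1 <= 2 and 0 <= x2 <= 2,
    "Q": lambda x1, x2: 1 <= x1 and 1 <= x2,
}


def clause2_holds(S, xp, yp, xq, yq):
    """Ground instance of  xq >= xp+1, yq >= yp+1 || not P(xp,yp) or Q(xq,yq)."""
    constraint = xq >= xp + 1 and yq >= yp + 1
    return (not constraint) or (not S["P"](xp, yp)) or S["Q"](xq, yq)


# Sample points -1, -1/2, 0, ..., 4 in each coordinate.
grid = [Fraction(k, 2) for k in range(-2, 9)]
all_hold = all(
    clause2_holds(S, xp, yp, xq, yq)
    for xp in grid for yp in grid for xq in grid for yq in grid
)
```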


The notion of symbolic interpretations is closely related to $\mathcal{A}$-definable
models [7, Definition 7] and constrained atomic representations [13, Definition 5.1,
pp. 236-237]. Each symbolic interpretation $S$ is equivalent to a
constrained atomic representation that consists of one constraint atom $[[P(\vec{x}) :
P^S(\vec{x})]]$ (written in the notation from [13]) for every predicate $P$. Note that in
this context the constraint is not just a quantifier-free conjunction of linear arithmetic
atoms, but a linear arithmetic formula potentially containing quantifiers
(although those can be eliminated with quantifier elimination techniques).

Since each symbolic interpretation consists of a finite set of
formulas of finite size, symbolic interpretations can be considered as finite
representations. In contrast, the standard representation of an interpretation as a
potentially infinite set of ground atoms is not a finite representation. However,
this also means that there are some interpretations for which no corresponding
symbolic interpretation exists. For instance, the set of prime numbers is a
satisfying interpretation for $y \approx 2 \parallel P(y)$, but it is not expressible as a symbolic
interpretation (in LA). As we will later see, any saturated set of HBS(LA)
clauses either is unsatisfiable or has a symbolic interpretation that satisfies it
(Theorem 29).

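The top interpretation, denoted $\mathcal{I}_\top$, is defined as $P^{\mathcal{I}_\top} := \mathcal{U}^n$ for all predicate
symbols $P$ with arity($P$) = $n$ and corresponds to the top symbolic interpretation,
denoted $S_\top$, defined as $P^{S_\top} := \top$ for all predicate symbols $P$.
The bottom interpretation (or empty interpretation), denoted $\mathcal{I}_\bot$, and the bottom
symbolic interpretation (or empty symbolic interpretation), denoted $S_\bot$,
are defined analogously. The interpretation of $P$ under $\mathcal{I} \cup \mathcal{J}$ is defined as
$P^{\mathcal{I} \cup \mathcal{J}} := P^{\mathcal{I}} \cup P^{\mathcal{J}}$ for every predicate $P$. In the symbolic case, $S \cup R$ is defined
as $P^{S \cup R}(\vec{x}) := P^S(\vec{x}) \lor P^R(\vec{x})$ for every predicate $P$. We write $\mathcal{I} \subseteq \mathcal{J}$ or $\mathcal{I}$ is
included in $\mathcal{J}$ (resp. $\mathcal{I} \subset \mathcal{J}$ or $\mathcal{I}$ is strictly included in $\mathcal{J}$) if $P^{\mathcal{I}} \subseteq P^{\mathcal{J}}$ (resp.
$P^{\mathcal{I}} \subset P^{\mathcal{J}}$) for all predicate symbols $P$.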


Definition 12 (Entailment of Literal). Let $\mathcal{I}$ be an interpretation. Given
a ground literal $P(a_1, \dots, a_n)$, where $a_i \in \mathcal{U}$, we write $\mathcal{I} \models P(a_1, \dots, a_n)$ if
$(a_1, \dots, a_n) \in P^{\mathcal{I}}$. Conversely, we write $\mathcal{I} \not\models P(a_1, \dots, a_n)$ if $(a_1, \dots, a_n) \notin P^{\mathcal{I}}$.
For a non-ground literal $L$, we write $\mathcal{I} \models L$ if for all grounding substitutions $\sigma$
for $L$, we have $\mathcal{I} \models L\sigma$. Conversely, we write $\mathcal{I} \not\models L$ if there exists a grounding
substitution $\sigma$ for $L$, such that $\mathcal{I} \not\models L\sigma$.


We overload $\models$ for symbolic interpretations, i.e. we write $S \models L$ and mean
$\mathcal{I}_S \models L$. The following function encodes a clause as an LA formula for evaluation
under a given symbolic interpretation.


Definition 13 (Clause Evaluation Function). Let $\Lambda \parallel C$ be a constrained
clause where $C = L_1 \lor \dots \lor L_m$, $L_i = [\neg]P_i(y_{i,1}, \dots, y_{i,n_i})$ and let $S$ be a symbolic
interpretation. Then the clause evaluation function $(\Lambda \parallel C)^S$ is defined as
follows based on the definitions for $\sigma_i$ and $\phi_i$ (for $1 \leq i \leq m$):

\[ \sigma_i := \{x_j \mapsto y_{i,j} \mid 1 \leq j \leq n_i\} \qquad \phi_i := \begin{cases} P_i^S\sigma_i & L_i \text{ is positive} \\ \neg P_i^S\sigma_i & L_i \text{ is negative (otherwise)} \end{cases} \]

\[ (\Lambda \parallel C)^S := \Big(\bigwedge_{\lambda \in \Lambda} \lambda\Big) \to \Big(\bigvee_{i=1}^{m} \phi_i\Big) \]


Note that the free variables of $(\Lambda \parallel C)^S$ are exactly the free variables of
$(\Lambda \parallel C)$. Moreover, the substitutions $\sigma_i$ are necessary in the above definition in
order to map the variables in the symbolic interpretation for the predicates $P_i^S$
to the variables that appear as arguments in the literals $P_i(y_{i,1}, \dots, y_{i,n_i})$.


Proposition 14. Given a constrained clause $\Lambda \parallel C$ with grounding $\beta$, we have

\[ \models (\Lambda \parallel C)^S\beta \quad\text{if and only if}\quad S \models (\Lambda \parallel C)\beta \]


As a corollary of the previous proposition, the entailment $S \models \Lambda \parallel C$ holds
if and only if the universal closure of the formula $(\Lambda \parallel C)^S$ is valid. This means
that for a symbolic interpretation $S$ it is always computable whether a clause is
entailed by $S$ because there are decision procedures for quantified LRA, LQA,
and LIA formulas of finite size.

We require two functions that manipulate LA-formulas directly to express
our model construction (cf. Definition 17), i.e. to map the solutions for a clause,
defined by a formula $\phi$, to one atom inside the clause. This requires
us to project away all variables in $\phi$ that appear in the clause but not in the
atom.


Definition 15 (Projection). Let $V$ be a set of variables and $\phi$ be an LA-formula.
The projection function $\pi$ is defined as follows:

\[ \pi(V, \phi) := \exists x_1 \dots \exists x_n.\ \phi \quad\text{where } \{x_1, \dots, x_n\} = \mathrm{vars}(\phi) \setminus V \]


$\pi(V, \phi)$ is a standard projection function that binds all variables of the formula
$\phi$ that are not in $V$ with existential quantifiers, so that only variables in $V$ remain
free. Note that we also know that $\pi(V, \phi)$ is equivalent to a quantifier-free LA
formula just over the variables in $V$
because there exist quantifier elimination algorithms for LRA, LQA, and LIA
[14,32].

A further function $\Upsilon$ is needed when we encounter literals of the form
$P(x, x, \dots)$, i.e., where one variable is shared among two arguments. In this
case, we use $\Upsilon$ to express in our symbolic interpretation that the equivalent
argument positions must also be equivalent in our interpretation.


Definition 16 (Sharing). Let $(y_1, \dots, y_n)$ and $(x_1, \dots, x_n)$ be tuples of variables
with the same length. The sharing function $\Upsilon$, which encodes variable sharing
across different argument positions, is defined as follows:

\[ \Upsilon((y_1, \dots, y_n), (x_1, \dots, x_n)) := \bigwedge_{1 \leq i < j \leq n,\ y_i = y_j} x_i \approx x_j \]
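
The sharing function translates directly into code. The following Python sketch returns the conjuncts of the sharing formula as a list of variable pairs, where each pair stands for one equation between two interpretation variables; the name `sharing` and the list-of-pairs encoding are assumptions made for this illustration. An empty list corresponds to the empty conjunction, i.e. true.

```python
def sharing(ys, xs):
    """Pairs (x_i, x_j) with i < j whose positions hold the same variable in ys.

    Each returned pair represents the equation x_i ≈ x_j; the conjunction of
    all pairs is the sharing formula of Definition 16.
    """
    if len(ys) != len(xs):
        raise ValueError("tuples must have the same length")
    return [
        (xs[i], xs[j])
        for i in range(len(ys))
        for j in range(i + 1, len(ys))
        if ys[i] == ys[j]
    ]


# Literal P(y1, y1) over interpretation variables (x1, x2): x1 ≈ x2
shared = sharing(("y1", "y1"), ("x1", "x2"))
```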
2.5 Consequence and Least Model


The notion of a least model is common in logic programming. Horn logic programs
admit a least model, which is the intersection of all models of the program
(see [31, § 6, p. 36]). In our context, the least model of a set of clauses $N$ is the
intersection of all models of $N$. An alternative characterization of the least model
of $N$ is through the least fixed point of the one-step consequence operator, which
we define as $T_N$ for the context of LA constraints analogously to [27, Section 4].
The one-step consequence operator $T_N$ takes a set of clauses $N$ and an interpretation
$\mathcal{I}$ as input and returns an interpretation:

\[ P^{T_N(\mathcal{I})} := \Big\{ (\vec{y})\beta \ \Big|\ \Lambda \parallel \neg P_1(\vec{y}_1) \lor \dots \lor \neg P_n(\vec{y}_n) \lor P(\vec{y}) \in N,\ \models \Lambda\beta \text{ and } \mathcal{I} \models P_i(\vec{y}_i)\beta \text{ for } 1 \leq i \leq n \Big\} \]


The least fixed point of this operator exists by Tarski's Fixed Point Theorem
[39]: Interpretations form a complete lattice under inclusion (supremum given
by union, infimum given by intersection), and $T_N$ is monotone.
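
The fixed-point characterization can be tried out on a toy scale. The Python sketch below iterates a one-step consequence operator starting from the empty interpretation, but only over a finite set of integer values; this finiteness is an assumption made purely so that the iteration terminates, whereas the operator in the text ranges over the full universe. The clause encoding (variables, constraint test, body atoms, head atom) and all names are invented for the illustration, and the two clauses follow Example 11 with non-strict bounds.

```python
from itertools import product


def one_step(clauses, interp, universe):
    """Apply the one-step consequence operator once, over a finite universe."""
    result = {}
    for variables, constraint, body, head in clauses:
        for values in product(universe, repeat=len(variables)):
            beta = dict(zip(variables, values))
            body_true = all(
                tuple(beta[v] for v in args) in interp.get(pred, set())
                for pred, args in body
            )
            if constraint(beta) and body_true:
                pred, args = head
                result.setdefault(pred, set()).add(tuple(beta[v] for v in args))
    return result


def least_fixed_point(clauses, universe):
    """Iterate from the empty interpretation until nothing changes."""
    interp = {}
    while True:
        new = one_step(clauses, interp, universe)
        if new == interp:
            return interp
        interp = new


# 0 <= y1 <= 2, 0 <= y2 <= 2 || P(y1, y2)
C1 = (("y1", "y2"),
      lambda b: 0 <= b["y1"] <= 2 and 0 <= b["y2"] <= 2,
      [], ("P", ("y1", "y2")))
# y3 >= y1 + 1, y4 >= y2 + 1 || P(y1, y2) -> Q(y3, y4)
C2 = (("y1", "y2", "y3", "y4"),
      lambda b: b["y3"] >= b["y1"] + 1 and b["y4"] >= b["y2"] + 1,
      [("P", ("y1", "y2"))], ("Q", ("y3", "y4")))

model = least_fixed_point([C1, C2], range(0, 5))
```

On this finite slice the result agrees with the symbolic interpretation of Example 11 restricted to the integers 0 to 4.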


3 Model Construction 


In this section we address construction of models for HBS(LA). Throughout this
section, we consider a set of constrained Horn clauses $N$ and an order $\prec$ to be
given. Our aim is to define an interpretation $\mathcal{I}_N$, such that

\[ \mathcal{I}_N \models N \quad\text{if } N \text{ is saturated and } \square \notin N \]


Towards that goal, we define the operator $\delta(S, \Lambda \parallel C' \lor P(\vec{y}))$. It takes a symbolic
interpretation $S$, and a Horn clause with maximal literal $P(\vec{y})$. It results in a
symbolic interpretation that accounts for $\Lambda \parallel C' \lor P(\vec{y})$.


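Definition 17 (Production Operator). Let $\Lambda \parallel C$ be a constrained Horn
clause, where $C = C' \lor P(\vec{y})$, $P(\vec{y}) \succ C'$, and $C' = \neg P_1(y_{1,1}, \dots, y_{1,n_1}) \lor \dots \lor
\neg P_m(y_{m,1}, \dots, y_{m,n_m})$. Let $S$ be a symbolic interpretation, where the free variables
of $P^S$ are $\vec{x}$ and the free variables of $P_i^S$ are $\vec{x}_i$ (for $1 \leq i \leq m$). Note that
$n = |\vec{y}| = |\vec{x}| = \mathrm{arity}(P)$.

The production operator $\delta(S, \Lambda \parallel C)$ results in a new symbolic interpretation

\[ P^{\delta(S, \Lambda \parallel C)}(\vec{x}) := \Big(\pi\Big(\{y_1, \dots, y_n\},\ \bigwedge_{\lambda \in \Lambda} \lambda \land \bigwedge_{i=1}^{m} (P_i^S)\sigma_i\Big)\Big)\sigma \land \Upsilon(\vec{y}, \vec{x}) \]

\[ Q^{\delta(S, \Lambda \parallel C)}(\vec{x}) := \bot \quad\text{for all } Q \neq P \text{ where } |\vec{x}| = \mathrm{arity}(Q) \]


where, to map variables from literal arguments to the variables appearing in the
symbolic interpretation $S$ and back, we have the substitutions

\[ \sigma := \{y \mapsto x_j \mid y \in \{y_1, \dots, y_n\} \text{ and } j \text{ is the smallest index s.t. } y_j = y\} \]
\[ \sigma_i := \{x_{i,j} \mapsto y_{i,j} \mid 1 \leq j \leq n_i\} \quad\text{for } 1 \leq i \leq m \]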
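
The substitution that renames the head arguments to interpretation variables picks, for every variable, the smallest argument position at which it occurs. A minimal Python sketch of this rule follows; the function name `head_substitution` and the naming scheme `x1, x2, ...` for the interpretation variables are assumptions for illustration.

```python
def head_substitution(ys):
    """Map each variable in ys to x_j, where j is its first position (1-based)."""
    sigma = {}
    for j, y in enumerate(ys, start=1):
        # setdefault keeps the entry from the smallest index and ignores repeats
        sigma.setdefault(y, f"x{j}")
    return sigma


# Head literal P(y1, y1, y2): y1 maps to x1, y2 maps to x3
sigma = head_substitution(("y1", "y1", "y2"))
```

Positions skipped by this rule (here the second one) are tied back to the chosen representative by the sharing function.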
The goal of the operator $\delta(S, \Lambda \parallel C)$ is to define an extension of the symbolic
interpretation $S$ such that $S \cup \delta(S, \Lambda \parallel C)$ satisfies $\Lambda \parallel C$. Note that $\delta$ only extends
the interpretation over the strictly maximal predicate $P$. Moreover, due to our
predicate order, it only needs to consider the interpretation $S$ for predicates
$Q$ with $Q \prec P$. $\delta$ also satisfies the following two symmetrical properties: On
the one hand, every grounding $\tau$ of $\Lambda \parallel C' \lor P(\vec{y})$ that is not yet satisfied by $S$
must correspond to a solution $\beta$ of $P^{\delta(S, \Lambda \parallel C' \lor P(\vec{y}))}$ that satisfies $P(\vec{y})\tau$. On the
other hand, every solution $\beta$ of $P^{\delta(S, \Lambda \parallel C' \lor P(\vec{y}))}$ must correspond to a grounding
of $\Lambda \parallel C' \lor P(\vec{y})$ that is not yet satisfied by $S$. The first property is needed so
$S \cup \delta(S, \Lambda \parallel C' \lor P(\vec{y}))$ satisfies $\Lambda \parallel C' \lor P(\vec{y})$. The second property is needed so
we do not accidentally extend our interpretation by any solutions not needed to
satisfy $\Lambda \parallel C' \lor P(\vec{y})$.

Note that in the above statements $\beta$ and $\tau$ are generally not the same because
the variables $\vec{x}$ used to define $P^S$ are not necessarily the same as the variables
appearing in the clause $\Lambda \parallel C$ and literal $P(\vec{y})$. There are three reasons for this
that are handled by three different methods in our model construction:


1. The variables in $S$ and $\Lambda \parallel C$ simply do not match, e.g. in $P^S := x_1 \approx 0$ and
$\Lambda \parallel C := y_1 > 0 \parallel P(y_1)$. This is handled by the substitution $\sigma$ in $\delta$ that maps
all variables in $P(\vec{y})$ to their appropriate variables in $P^S$, e.g. in the previous
example $\sigma = \{y_1 \mapsto x_1\}$ and $P^{\delta(S, \Lambda \parallel C)} = (y_1 > 0)\sigma = x_1 > 0$.

2. Not all variables in $\Lambda \parallel C$ also appear in $P(\vec{y})$, e.g. in $P^S := x_1 \approx 0$ and
$\Lambda \parallel C := x_1 \approx y_1 + 1 \land y_1 \approx 0 \parallel P(x_1)$. This is handled in $\delta$ by the projection
operator $\pi$ (Definition 15) that binds all variables that appear in $\Lambda \parallel C$ but not
in $P(\vec{y})$, e.g. in the previous example $P^{\delta(S, \Lambda \parallel C)} := \pi(\{x_1\}, x_1 \approx y_1 + 1 \land y_1 \approx 0)$,
where $\pi(\{x_1\}, x_1 \approx y_1 + 1 \land y_1 \approx 0) = \exists y_1.\ x_1 \approx y_1 + 1 \land y_1 \approx 0$, which is
equivalent to $x_1 \approx 1$.

3. Some variables might occur in multiple argument positions, e.g. in $\Lambda \parallel C :=
\top \parallel P(y_1, y_1)$. This case is covered in $\delta$ by the sharing function $\Upsilon$ (cf. Definition 16)
that expresses which variables in $P^{\delta(S, \Lambda \parallel C)}$ must map to the
same value. Continuing the example, $\Upsilon((y_1, y_1), (x_1, x_2)) = x_1 \approx x_2$ and
$P^{\delta(S, \Lambda \parallel C)}(x_1, x_2) = \Upsilon((y_1, y_1), (x_1, x_2))$.


The parts of $P^{\delta(S, \Lambda \parallel C)}$ that we have not yet discussed are based on the
fact that any constrained Horn clause $\Lambda \parallel C' \lor P(\vec{y})$ can also be written as an
implication of the form $\phi \to P(\vec{y})$, where $\phi := \Lambda \land P_1(y_{1,1}, \dots, y_{1,n_1}) \land \dots \land
P_m(y_{m,1}, \dots, y_{m,n_m})$ and $S \not\models (\Lambda \parallel C')\tau$ if and only if $S \models \phi\tau$. This means the
groundings $\tau$ of $\Lambda \parallel C'$ not satisfied by $S$ are also the groundings of $\phi$ satisfied
by $S$. It is straightforward to express these groundings with a conjunctive formula
based on $\Lambda$ and the $P_i^S$. The only challenge is the reverse problem from before,
i.e. mapping the variables of $P_i^S$ to the variables in the literals $P_i(y_{i,1}, \dots, y_{i,n_i})$.
This mapping is done in $\delta$ by the substitution $\sigma_i$.

Now, based on the production operator $\delta$ for one clause, we can use an
inductive definition over the order $\prec$ to define an interpretation $S_N$ for all clauses
in $N$. We distinguish the following auxiliary symbolic interpretations: $S_{\prec P}$ which
captures progress up to but excluding the predicate $P$, $\Delta_P$ which captures how
$P$ should be interpreted considering $S_{\prec P}$, and $S_{\preceq P}$ which captures progress
up to and including the predicate $P$. The symbolic interpretation $\Delta_P^{\Lambda \parallel C}$ is the
extension of $S_{\prec P}$ w.r.t. the single clause $\Lambda \parallel C$.


Definition 18 (Model Construction). Let $N$ be a finite set of constrained
Horn clauses. We define symbolic interpretations $S_{\preceq P}$, $S_{\prec P}$ and $\Delta_P$ for all
predicates $P \in \Pi(N)$ by mutual induction over $\prec$:

\[ S_{\preceq P} := S_{\prec P} \cup \Delta_P \qquad S_{\prec P} := \bigcup_{Q \prec P} \Delta_Q \qquad \Delta_P := \bigcup_{\Lambda \parallel C' \lor P(*) \in N} \Delta_P^{\Lambda \parallel C' \lor P(*)} \]

\[ \Delta_P^{\Lambda \parallel C} := \begin{cases} \delta(S_{\prec P}, \Lambda \parallel C) & \text{if } P(\vec{y}) \text{ maximal in } C \text{, and } S_{\prec P} \not\models \Lambda \parallel C \\ S_\bot & \text{otherwise} \end{cases} \]


Finally, based on the above inductive definition of $S_{\preceq P}$ for every predicate
symbol $P \in \Pi(N)$, we arrive at an overall interpretation for $N$.


Definition 19 (Candidate Interpretation). The candidate interpretation
for $N$ (w.r.t. $\prec$), denoted $\mathcal{I}_N$, is the interpretation associated with the symbolic
interpretation $S_N = \bigcup_{P \in \Pi(N)} \Delta_P$ where $P$ ranges over all predicate symbols
occurring in $N$.


Note that $S_N = S_{\preceq P}$ where $P$ is $\prec$-maximal in $\Pi(N)$. Obviously, we intend
that $S_N \models N$ if $N$ is saturated (Theorem 29). Otherwise, i.e. $S_N \not\models N$, we can
use our construction to find a non-redundant inference (Corollary 30). Consider
the following two examples, demonstrating how $\delta$ sits at the core of the
aforementioned inductive definitions of symbolic interpretations.


Example 20 (Dependent Interpretation). Assume $P \prec Q$ and consider the following
set of clauses:

\[ N = \left\{ \begin{array}{ll} 0 \leq y_1 \leq 2,\ 0 \leq y_2 \leq 2 \parallel \underline{P(y_1, y_2)} & (C_1) \\ y_3 \geq y_1 + 1,\ y_4 \geq y_2 + 1 \parallel P(y_1, y_2) \to \underline{Q(y_3, y_4)} & (C_2) \end{array} \right\} \]


Maximal literals are underlined. Since the maximal literals of $C_1$ and $C_2$ are
both positive, ordered resolution cannot be applied. The set is saturated. Since
$P$ is the $\prec$-smallest predicate we have $S_{\prec P} = S_\bot$. Applying the $\delta$ operator yields
the following interpretation for $P$:

\[ P^{S_{\preceq P}}(x_1, x_2) = P^{\delta(S_{\prec P}, C_1)}(x_1, x_2) = 0 \leq x_1 \leq 2 \land 0 \leq x_2 \leq 2 \]


Then, $Q$ is interpreted relative to $P$. Consider the clause $C_2$: For all solutions
of its constraint $y_3 \geq y_1 + 1, y_4 \geq y_2 + 1$ our model must also satisfy its logical
part $P(y_1, y_2) \to Q(y_3, y_4)$. The intuition that $Q$ depends on $P$ arises from the
implication in the logical part. Whenever the constraint of $C_2$ and $P(y_1, y_2)$
are satisfied, $Q(y_3, y_4)$ must be satisfied. These are exactly the points defined
through $\delta(S_{\prec Q}, C_2)$, based on $S_{\prec Q} = S_{\preceq P} = \delta(S_{\prec P}, C_1)$:

\[ \begin{array}{rl} Q^{\delta(S_{\prec Q}, C_2)}(x_1, x_2) &= \exists z_1, z_2.\ x_1 \geq z_1 + 1 \land x_2 \geq z_2 + 1 \land 0 \leq z_1 \leq 2 \land 0 \leq z_2 \leq 2 \\ &= x_1 \geq 1 \land x_2 \geq 1 \end{array} \]

Whenever the conjuncts $0 \leq y_1 \leq 2$ and $0 \leq y_2 \leq 2$ are satisfied, the premise of
the implication is true, thus there must be a solution to the interpretation of $Q$,
additionally abiding by the constraint of the clause. Since $Q$ is $\prec$-maximal in $N$, we
arrive at $S_N = S_{\preceq Q} = S_{\preceq P} \cup \delta(S_{\prec Q}, C_2) = \delta(S_\bot, C_1) \cup \delta(S_{\preceq P}, C_2)$. See Fig. 1a
for a visual representation of $S_N$.
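
The quantifier-free form of the interpretation of $Q$ in Example 20 can be derived by eliminating the bound variables one at a time. The following derivation is a sketch for the first coordinate, assuming the bounds in the example are non-strict; it pairs every lower bound on $z_1$ with every upper bound, in the style of Fourier-Motzkin elimination. The second coordinate is analogous.

```latex
\begin{align*}
\exists z_1.\ (x_1 \geq z_1 + 1 \land 0 \leq z_1 \land z_1 \leq 2)
  &\iff \exists z_1.\ (0 \leq z_1 \land z_1 \leq x_1 - 1 \land z_1 \leq 2) \\
  &\iff 0 \leq x_1 - 1 \land 0 \leq 2 \\
  &\iff x_1 \geq 1
\end{align*}
```

The middle step is sound over the rationals and reals because an interval with a lower bound not exceeding its upper bounds is non-empty; here it also holds over the integers, since all bounds on $z_1$ are integral whenever $x_1$ is.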


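Example 21 (Unsaturated Clause Set). Assume $P \prec Q$ and consider the following
set of clauses:

\[ N = \left\{ \begin{array}{ll} y_1 < 0 \parallel \underline{P(y_1)} \quad (C_1), & y_1 < 1 \parallel \underline{Q(y_1)} \quad (C_3), \\ y_1 > 0 \parallel \underline{P(y_1)} \quad (C_2), & y_1 \leq 0 \parallel \underline{Q(y_1)} \to P(y_1) \quad (C_4) \end{array} \right\} \]


Maximal literals are underlined. Note that a resolution inference is possible, since
the maximal literals of $C_3$ and $C_4$ have opposite polarity, use the same predicate
symbol, and are trivially unifiable. Thus, in this example we consider the effect
of applying our model construction to a clause set that is not saturated. Since
$P$ is $\prec$-minimal, we start with the following steps:

\[ \begin{array}{ll} S_{\prec P} = S_\bot, & P^{\delta(S_{\prec P}, C_1)}(x_1) = x_1 < 0 \\ P^{\delta(S_{\prec P}, C_2)}(x_1) = x_1 > 0, & P^{S_{\preceq P}}(x_1) = x_1 < 0 \lor x_1 > 0 \end{array} \]


Next, we obtain the following results for $Q$:

\[ \begin{array}{ll} S_{\prec Q} = S_{\preceq P}, & Q^{\delta(S_{\prec Q}, C_3)}(x_1) = x_1 < 1 \\ Q^{\Delta_Q^{C_4}}(x_1) = \bot, & Q^{S_{\preceq Q}}(x_1) = x_1 < 1 \lor \bot = x_1 < 1 \end{array} \]


See Fig. 1b for a visual representation of $S_N = S_{\preceq Q}$. Note that $S_N \not\models C_4$, since
we have $S_N \models Q(0)$ but $S_N \not\models P(0)$. Thus, by using the constructed model, we
can pinpoint clauses that contradict that $N$ is saturated. Applying resolution to
$C_3$ and $C_4$ leads to the clause $y_1 \leq 0 \parallel P(y_1)$ labelled $C_5$. If we then add $C_5$ to
$N$, we instead get $P^{S_{\preceq P}}(x_1) = x_1 < 0 \lor x_1 > 0 \lor x_1 \leq 0 = \top$.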
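
The failure of the fourth clause at the point 0 can be replayed with a few lines of Python. This is an illustrative sketch only: it assumes that the constraint of the fourth clause is a non-strict upper bound 0 on its variable, it hard-codes the interpretations computed in Example 21 as plain functions with invented names, and it tests a handful of rational sample points rather than deciding validity.

```python
from fractions import Fraction


def q(x):
    """Interpretation of Q after the construction: x < 1."""
    return x < 1


def p_unsaturated(x):
    """Interpretation of P for the original clause set: x < 0 or x > 0."""
    return x < 0 or x > 0


def p_with_resolvent(x):
    """Interpretation of P once the resolvent has been added: always true."""
    return x < 0 or x > 0 or x <= 0


def c4_holds(p, y):
    """Ground instance of the fourth clause: y <= 0 and Q(y) imply P(y)."""
    return not (y <= 0 and q(y)) or p(y)


samples = [Fraction(k, 2) for k in range(-6, 7)]  # -3, -5/2, ..., 3
violations = [y for y in samples if not c4_holds(p_unsaturated, y)]
repaired = all(c4_holds(p_with_resolvent, y) for y in samples)
```

Among the sampled points, only 0 violates the clause, and the violation disappears once the resolvent contributes to the interpretation of P.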


In the following, we clarify some properties of the construction. We provide 
an upper bound for the number of LA atoms and quantifiers in the symbolic 
model for LRA and LQA. Although we do not state it explicitly, the estimate 
for LIA works in a similar way, but due to the higher complexity of LIA quantifier 
elimination, the size of the symbolic model grows triple exponentially [36]. 


Proposition 22. If $N$ is a finite set of LRA/LQA constrained Horn clauses,
and $S'_N$ is the result of applying quantifier elimination to $S_N$ then, for every predicate
symbol $P \in \Pi(N)$, the number of LA atoms in $P^{S'_N}$ is in O(m? .
n2? . (14 a2)1") where n is the max. number of clauses with the same max.
predicate, m is the max. number of non-arithmetic literals in a clause, l is the
max. number of arithmetic literals in a clause, a is the max. arity of any predicate,
$p = |\Pi(N)|$, q is the max. difference of variables in any clause and its
positive maximal literal.


Fig. 1. Visual representation of the models resulting from Examples 20 and 21: (a) result of Example 20, (b) result of Example 21.


Corollary 23 (Effective Construction). If $N$ is a finite set of constrained
Horn clauses then for every predicate $P \in \Pi(N)$, $P^{S_N}$ is a linear arithmetic
formula of finite size, and can be computed in a finite number of steps.


We show that all points in $P^{\mathcal{I}_N}$ are necessary and justified in some sense,
that $\mathcal{I}_N$ is indeed a model of $N$, and that $\mathcal{I}_N$ is also the least model of $N$ if $N$
is saturated. The notion of whether a clause is productive captures whether it
contributes something to the symbolic interpretation.


Definition 24 (Productive Clause). Let $P$ be a predicate symbol with
arity($P$) = $n$. We say that $\Lambda \parallel C$ produces $P(a_1, \dots, a_n)$ if $(a_1, \dots, a_n) \in
P^{\mathcal{I}_{\Delta_P^{\Lambda \parallel C}}}$.


Next, we want to formally express that every element of the resulting interpretation
is justified. Firstly, we express that the operator $\delta$ will produce points
such that every clause is satisfied whenever necessary, i.e. whenever the maximal
literal of the clause is $P(*)$ and the clause is not yet satisfied by $S_{\prec P}$.


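Proposition 25. Let $\Lambda_C \parallel C$ be a clause where $C = C' \lor P(\vec{y})$ and $C' \prec P(\vec{y})$. Let $\tau$ be a
grounding substitution for $\Lambda_C \parallel C$. If $S_{\prec P} \not\models (\Lambda_C \parallel C)\tau$, then $\models \Lambda_C\tau$ and $S_{\preceq P} \models
P(\vec{y})\tau$, thus $S_{\preceq P} \models (\Lambda_C \parallel C)\tau$.


Secondly, we express that every point in $P^{\mathcal{I}_N}$ is justified in the sense
that there is a clause that produced the point, i.e. this clause would otherwise
not be satisfied by the resulting interpretation.


Proposition 26. If $S_{\preceq P} \models P(\vec{a})$, then there exists a clause $\Lambda_C \parallel C$ where $C =
C' \lor P(\vec{y})$ and $C' \prec P(\vec{y})$, and there exists a grounding $\tau$ for $\Lambda_C \parallel C$, such that
$P(\vec{a}) = P(\vec{y})\tau$ and $S_{\prec P} \not\models (\Lambda_C \parallel C)\tau$.


Also, observe that once the maximal predicate $P$ of a given clause is interpreted
by $S_{\preceq P}$, the interpretation of the clause does not change for $S_{\preceq Q}$ where
$Q \succ P$.

Corollary 27. Let $P \prec Q \prec R$, and $P$ be maximal in clause $C$. If $S_{\preceq P} \models \Lambda_C \parallel C$
or $S_{\prec Q} \models \Lambda_C \parallel C$, then $S_{\prec R} \models \Lambda_C \parallel C$ and $S_{\preceq R} \models \Lambda_C \parallel C$.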


As a result, we know that the full model satisfies $N$, i.e., $\mathcal{I}_N \models N$, if every
clause is satisfied at the point of the construction where the interpretation of
its maximal predicate $P$ stays fixed.


Proposition 28. If $S_{\preceq P} \models \Lambda_C \parallel C$ holds for every clause $\Lambda_C \parallel C \in N$ with maximal
predicate $P$, then $\mathcal{I}_N \models N$.


With the above propositions (and some auxiliary properties that can be found
in [12]) we show that indeed $\mathcal{I}_N \models N$ if $N$ is saturated and does not contain the
empty clause.


Theorem 29. Let $\prec$ be a clause ordering and $N$ be a set of constrained Horn
clauses. If (1.) $N$ is saturated w.r.t. $\prec$-resolution, and (2.) $\square \notin N$, then $\mathcal{I}_N \models N$.


For clauses with positive maximal literal, the fact that they are satisfied
by $\mathcal{I}_N$ follows from Proposition 25. For clauses with maximal literal $\neg P(*)$, we
prove this theorem by contradiction: Assume there is a minimal clause $\Lambda_C \parallel C$ such
that $S_N \not\models \Lambda_C \parallel C$. We can then exploit Proposition 26 to find the smallest
clause $\Lambda_D \parallel D$ that produced the respective instance $P(\vec{a})$. Applying hierarchic
$\prec$-resolution to $\Lambda_C \parallel C$ and $\Lambda_D \parallel D$ then yields a non-redundant clause, which
contradicts the assumption that $N$ is saturated. This
idea then leads to the following corollary.


Corollary 30. Let $\prec$ be a clause ordering and $N$ be a set of constrained Horn
clauses. If (1.) $\mathcal{I}_N \not\models N$, and (2.) $\square \notin N$, then there exist two clauses $\Lambda_C \parallel C$,
$\Lambda_D \parallel D \in N$ such that: (1.) $\Lambda_C \parallel C$ is the smallest clause not satisfied by $\mathcal{I}_N$,
i.e. there exists a grounding $\tau$ such that $\mathcal{I}_N \not\models (\Lambda_C \parallel C)\tau$, but there does not
exist a clause $\Lambda_{C'} \parallel C' \in N$ with grounding $\tau'$, such that $\mathcal{I}_N \not\models (\Lambda_{C'} \parallel C')\tau'$
and $(\Lambda_{C'} \parallel C')\tau' \prec (\Lambda_C \parallel C)\tau$, (2.) $\neg P(\vec{a})$ is the maximal literal of $(\Lambda_C \parallel C)\tau$,
(3.) $\Lambda_D \parallel D$ is the minimal clause that produces $P(\vec{a})$, (4.) $\prec$-resolution is applicable
to $\Lambda_C \parallel C$ and $\Lambda_D \parallel D$, and (5.) the resolvent of $\Lambda_C \parallel C$ and $\Lambda_D \parallel D$ is not
redundant w.r.t. $N$.


Additionally, we show that $\mathcal{I}_N$ is the least model of $N$, establishing a connection
between our approach and the literature on constrained Horn clauses (see
[27, Section 4] and [15, Section 2.4.1]) and logic programming (see [31, § 6, p. 37]).


Theorem 31. $\mathcal{I}_N$ is the least model of $N$.


Fermüller and Leitsch define four postulates (see [19] as cited in [13,
Section 5.1, p. 234]) regarding automated model building. In the following, we 
instantiate the postulates for our setting. By G(N) we denote the set of all sym- 
bolic interpretations of the set of constrained Horn clauses N. We argue how our 
approach satisfies all postulates, one by one: 


Uniqueness. Each element of G(N) specifies a single interpretation of N.
We have shown (cf. Theorem 31) that 𝓘_N, the model represented by S_N, is
the least model of N, which is unique.

Atom Test. There exists a fast procedure to evaluate arbitrary ground atoms
over Π(N) in the interpretation defined by an S in G(N).
This is a special case of clause evaluation (cf. Proposition 14): A ground atom
P(t̄) is true in S if and only if ⊨ P^S(x̄){x_i ↦ t_i | 1 ≤ i ≤ |x̄| = |t̄|}. Fulfillment
of this property thus hinges on the meaning of “fast”. We consider methods
for evaluating formulas of LA against points to be fast.

Formula Evaluation. There exists an algorithm deciding the truth values of
arbitrary formulas in interpretations defined by S ∈ G(N).
Proposition 14 states that evaluating a constrained clause Λ ∥ C is achieved
by evaluating the universal closure of (Λ ∥ C)^S, which is decided by quantifier
elimination algorithms for LRA, LQA, and LIA [14,32]. For sets of clauses,
evaluate each clause individually and combine the results conjunctively.

Equivalence Test. There exists an algorithm which decides whether two
representations S_1 and S_2 in G(N) describe the same interpretation.
S_1 and S_2 describe the same interpretation if and only if for each predicate
P ∈ Π(N) of arity n, we have ∀x_1 … ∀x_n. P^{S_1}(x̄) ↔ P^{S_2}(x̄).
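
The Atom Test amounts to substituting a point into a linear arithmetic formula and evaluating it. The following is a minimal sketch of that idea, assuming (purely for illustration) that the symbolic interpretation P^S of a predicate is stored as a disjunction of conjunctions of linear constraints. The names `LinearConstraint`, `holds_at`, `atom_is_true` and the example predicate `P_S` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Sequence


@dataclass(frozen=True)
class LinearConstraint:
    """Hypothetical encoding of sum(coeffs[i] * x_i) OP bound, OP in {'<', '<=', '='}."""
    coeffs: tuple
    op: str
    bound: Fraction

    def holds_at(self, point: Sequence[Fraction]) -> bool:
        # Evaluate the left-hand side exactly, using rational arithmetic.
        value = sum(Fraction(c) * p for c, p in zip(self.coeffs, point))
        if self.op == "<":
            return value < self.bound
        if self.op == "<=":
            return value <= self.bound
        if self.op == "=":
            return value == self.bound
        raise ValueError(f"unsupported operator: {self.op}")


def atom_is_true(symbolic_pred, point):
    """symbolic_pred is a list of conjunctions (lists of constraints), read as a DNF."""
    return any(all(c.holds_at(point) for c in conj) for conj in symbolic_pred)


# Assumed example: P^S(x1, x2) := (x1 <= x2 and x1 >= 0) or (x1 + x2 = 1)
P_S = [
    [LinearConstraint((1, -1), "<=", Fraction(0)),
     LinearConstraint((-1, 0), "<=", Fraction(0))],
    [LinearConstraint((1, 1), "=", Fraction(1))],
]

print(atom_is_true(P_S, (Fraction(1), Fraction(2))))  # first disjunct holds
print(atom_is_true(P_S, (Fraction(3), Fraction(1))))  # no disjunct holds
```

Evaluating an atom this way costs one pass over the stored constraints, which is one reasonable reading of “fast” in the postulate.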


4 Conclusion 


We have presented the first model construction approach to Horn clauses with 
linear arithmetic constraints based on hierarchic ordered resolution (cf. Definition 19).
The linear arithmetic constraints may range over the reals, rationals, or
integers. The computed model is the canonical least model of the saturated Horn 
clause set (cf. Theorem 31). Clauses can be effectively evaluated with respect to 
the model (cf. Proposition 14). This offers a way to explore the properties of a 
saturated clause set, e.g., if the set represents a failed refutation attempt. 


Future Work. It is straightforward to see that any symbolic LQA model is 
also a symbolic LRA model. (This holds due to convexity of conjunctions of 
ground LQA atoms.) So even if the axiom of choice is not assumed, there is
an alternative way to obtain a model for an HBS(LRA) clause set: simply treat
it as an HBS(LQA) clause set, saturate it and construct its model based on 
HBS(LQA). 

In this work, we restrict ourselves to only one sort LA per set of clauses. An
extension to a many-sorted setup, e.g., including first-order variables with sort
F, is possible. This can even be simulated by encoding first-order constants as
concrete natural numbers via a bijection to ℕ, since ℕ ⊂ U. If no arithmetic
constraints are placed on the variables used for the encoding, the encoding can
be read off the resulting model and mapped back.
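
The encoding idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' construction: it numbers a finite list of constants in first-seen order (a finite prefix of a bijection to ℕ), and the helper names `build_encoding`, `encode_atom` and `decode_atom` are hypothetical.

```python
def build_encoding(constants):
    """Assign each distinct constant a distinct natural number, in first-seen order."""
    encode = {}
    for c in constants:
        if c not in encode:
            encode[c] = len(encode)
    # The inverse map is what lets us read constants back off a model.
    decode = {n: c for c, n in encode.items()}
    return encode, decode


def encode_atom(pred, args, encode):
    """Replace first-order constants by their numeric codes."""
    return (pred, tuple(encode[a] for a in args))


def decode_atom(pred, nums, decode):
    """Map numeric codes found in a model back to the original constants."""
    return (pred, tuple(decode[n] for n in nums))


encode, decode = build_encoding(["alice", "bob", "alice", "carol"])
encoded = encode_atom("Knows", ("alice", "carol"), encode)
print(encoded)
print(decode_atom(*encoded, decode))
```

Because the codes are never compared arithmetically, only their identity matters, which is why the round trip is lossless.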

One obvious challenge is relaxation of the restriction to Horn clauses. With 
respect to ordered resolution saturation there is typically no difference in the 
sense that if a Horn fragment can always be finitely saturated, so can the non- 
Horn fragment be. However, our proposed ordering for the model construction at 
the granularity of predicate symbols will not suffice in this general case, and the 
key to overcome this challenge seems to be the appropriate treatment of clauses 
with maximal literals of the same predicate. Backtracking on the selection of
literals might also be sufficient. 

The approach we presented does not exploit features of linear arithmetic 
beyond equality and the existence of a well-founded order for the underlying 
universe U. The results may therefore be adapted to other constraint domains 
such as non-linear arithmetic. 


Acknowledgements. We thank our reviewers for their constructive comments. 
Abstract. This work is a part of an ongoing effort to understand the relationships
between properties used in theory combination. We here focus on including two 
properties that are related to shiny theories: the finite model property and stable 
finiteness. For any combination of properties, we consider the question of whether 
there exists a theory that exhibits it. When there is, we provide an example with 
the simplest possible signature. One particular class of interest includes theories 
with the finite model property that are not finitely witnessable. To construct such 
theories, we utilize the Busy Beaver function. 


Keywords: satisfiability modulo theories · theory combination · theory
politeness · theory shininess


1 Introduction 


The story of this paper begins with [7], where it was shown that the theory of algebraic 
datatypes, useful for modeling data structures like lists and trees, can be combined with 
any other theory, using the polite combination method [6]. This combination method 
offers a way to combine decision procedures of two theories into a decision procedure
for the combined theory, with different assumptions than those of the earlier Nelson- 
Oppen approach [4]. In particular, it was proven that the theory admits a technical prop- 
erty concerning cardinalities of models, called strong politeness [2]. It was noted in [7] 
that proving strong politeness for this theory seemed much harder than proving polite- 
ness, a similar but simpler property. Therefore, the proof was split into three steps: 
(i) a class of theories was identified in which politeness and strong politeness coincide; 
(ii) the theory of algebraic datatypes was shown to be in this class; and (iii) this theory 
was proven to be polite. This proof technique raised the following question: does polite- 
ness imply strong politeness? An affirmative answer to this question would simplify 
strong politeness proofs that follow such steps, as only the last step would be needed. 
Unfortunately, the answer to this question was shown in [8] to be negative, in its most 
general form. However, an affirmative answer was given for theories over one-sorted 
empty signatures, where politeness and strong politeness do coincide. 

Seeing that relationships between model-theoretic properties of theories (like politeness
and strong politeness) are non-trivial, and can have a big impact on proofs in the
field of theory combination, we have recently initiated a more general research plan:
to systematically determine the relationships between model-theoretic properties that
relate to theory combination. An analysis of such properties can, for example, simplify
proofs, in cases where a property follows from a combination of other properties.

© The Author(s) 2023

In the first stage of this plan [10], we studied the relationships between all properties 
that relate to either polite or Nelson-Oppen combination, namely: stable infiniteness, 
smoothness, finite witnessability, strong finite witnessability, and convexity. The first 
two properties relate to the ability to enlarge cardinalities of models, while the next two 
require a computable witness function that restricts the models of a formula based on its 
variables. The last property relies on the ability to deduce an equality from a disjunction 
of equalities. The result of [10] was a comprehensive table: nearly every combination 
of these properties (e.g., theories that are smooth and stably infinite but do not admit 
the other properties) was either proved to be infeasible, or an example for it was given. 

In this paper we continue with this plan by adding two properties: the finite model 
property and stable finiteness, both related to shiny theories [9]. The former requires 
finite models for satisfiable formulas, and the latter enforces bounds on them. 

Of course, the theories from [10] can be reused. For these, one only needs to deter- 
mine if they admit the finite model property and/or stable finiteness. The results and 
examples from [10] are, however, not enough. Given that the number of considered 
combinations is doubled with the addition of each property, new theories need to be 
introduced in order to exemplify the new possibilities, and new impossible combina- 
tions can be found. Hence, in this paper we provide several impossibility results for the 
aforementioned properties, as well as examples of theories for possible combinations. 
The overall result is a new table which extends that of [10] with two new columns 
corresponding to the finite model property and stable finiteness.¹

The most interesting combinations that we study are theories that admit the finite 
model property but not finite witnessability. While both properties deal with finite mod- 
els, the latter has a computable element to it, namely the witness function. In separat- 
ing these properties, we found it useful to define theories that are based on the Busy 
Beaver function, a well known function from computability theory, that is not only 
non-computable, but also grows eventually faster than any computable function. 


Outline: Sect. 2 reviews many-sorted logics and theory combination properties.
Section 3 identifies combinations that are contradictory; Sect. 4 constructs the extended 
table of combinations, and describes the newly introduced theories. Section 5 gives final 
remarks and future directions this work can take. The proofs for the results in this paper 
may be found in an appendix to a preprint version of this work, available as [11]. 


2 Preliminary Notions 


2.1 Many-Sorted Logic 


A many-sorted signature Σ is a triple (S_Σ, F_Σ, P_Σ) where: S_Σ is a countable set of
sorts; F_Σ is a countable set of function symbols; and P_Σ is a countable set of predicate
symbols containing, for each σ ∈ S_Σ, an equality =_σ. When σ is clear from the context,
we write =. Every function symbol has an arity of the form σ_1 × ⋯ × σ_n → σ, and
every predicate symbol one of the form σ_1 × ⋯ × σ_n, where σ_1, …, σ_n, σ ∈ S_Σ;
equalities =_σ have arity σ × σ.


¹ While we use several results from [10], we do not assume here any familiarity with that paper.
All required results are mentioned here explicitly.


Combining Finite Combination Properties: Finite Models and Busy Beavers 161

A signature that has no functions and only the equalities as predicates is called
empty. Many-sorted signatures Σ where S_Σ has only one element are called one-sorted.

For any sort in S_Σ we assume a countably infinite set of variables, and distinct sorts
have disjoint sets of variables; we then define first-order terms, formulas, and literals in
the usual way. The set of free variables of sort σ in a formula φ is denoted by vars_σ(φ),
while vars(φ) will denote ⋃_{σ ∈ S_Σ} vars_σ(φ).

Σ-structures 𝔸 are defined as usual, by interpreting sorts (denoted by σ^𝔸), functions
(f^𝔸) and predicate symbols (P^𝔸), with the restriction that equality symbols are
interpreted as identities. A Σ-interpretation 𝒜 is an extension of a Σ-structure 𝔸 with
interpretations of variables. If 𝔸 is the underlying Σ-structure of a Σ-interpretation 𝒜,
we say that 𝒜 is an interpretation on 𝔸. For simplicity, and because the use of structures
is sparse in this paper, we will usually denote both structures and interpretations
by using the same font, 𝒜, ℬ and so on. α^𝒜 is the value taken by a Σ-term α in a
Σ-interpretation 𝒜, and if Γ is a set of terms, we simply write Γ^𝒜 for {α^𝒜 : α ∈ Γ}.

We write 𝒜 ⊨ φ if the Σ-interpretation 𝒜 satisfies the Σ-formula φ; φ is then said
to be satisfiable if it is satisfied by some interpretation 𝒜. The formulas found in Fig. 1
will be useful in the sequel. A Σ-interpretation 𝒜: satisfies ψ^σ_{≥n} iff |σ^𝒜| ≥ n; satisfies
ψ^σ_{≤n} iff |σ^𝒜| ≤ n; and satisfies ψ^σ_{=n} iff |σ^𝒜| = n. For simplicity, when dealing with
one-sorted signatures, we may drop the sort σ from the cardinality formulas.


Fig. 1. Cardinality Formulas. x̄ stands for x_1, …, x_n, all variables of sort σ.
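
The body of Fig. 1 is not reproduced in this text. One standard way to write formulas with exactly the semantics stated above is given here as an assumption, not necessarily the authors' own definitions:

```latex
\begin{align*}
\psi^{\sigma}_{\geq n} &= \exists\, \overline{x}.\ \bigwedge_{1 \leq i < j \leq n} \neg (x_i = x_j) \\
\psi^{\sigma}_{\leq n} &= \exists\, \overline{x}.\ \forall y.\ \bigvee_{i=1}^{n} y = x_i \\
\psi^{\sigma}_{= n}    &= \psi^{\sigma}_{\geq n} \wedge \psi^{\sigma}_{\leq n}
\end{align*}
```

Here the first formula forces n pairwise distinct elements of sort σ, the second forces every element to coincide with one of n named ones, and their conjunction pins the cardinality to exactly n.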


A Σ-theory 𝒯 is the class of all Σ-interpretations (called 𝒯-interpretations) that
satisfy some set Ax(𝒯) of closed formulas called the axiomatization of 𝒯; the structures
underlying these interpretations will be called the models of 𝒯.

A formula is 𝒯-satisfiable if it is satisfied by some 𝒯-interpretation and, analogously,
a set of formulas is 𝒯-satisfiable if there is a 𝒯-interpretation that satisfies all of
them simultaneously. Two formulas are 𝒯-equivalent when a 𝒯-interpretation satisfies
the first iff it satisfies the second. We write ⊨_𝒯 φ, and say that φ is 𝒯-valid, if 𝒜 ⊨ φ
for all 𝒯-interpretations 𝒜.


2.2 Theory Combination Properties 


Let Σ be a signature, 𝒯 a Σ-theory and S ⊆ S_Σ. We define several properties 𝒯 may
have with respect to S.
Convexity, Stable Infiniteness, and Smoothness. 𝒯 is convex with respect to S if for
any conjunction of Σ-literals φ and any finite set of variables {u_1, v_1, …, u_n, v_n} of
sorts in S with ⊨_𝒯 φ → ⋁_{i=1}^n u_i = v_i, one has ⊨_𝒯 φ → u_i = v_i for some i. 𝒯
is stably infinite with respect to S if for every 𝒯-satisfiable quantifier-free Σ-formula
there is a 𝒯-interpretation 𝒜 satisfying it such that |σ^𝒜| is infinite for each σ ∈ S. 𝒯
is smooth with respect to S if for every quantifier-free formula, 𝒯-interpretation 𝒜 that
satisfies it, and function κ from S to the class of cardinals such that κ(σ) ≥ |σ^𝒜| for
each σ ∈ S, there is a 𝒯-interpretation ℬ that satisfies it with |σ^ℬ| = κ(σ) for each
σ ∈ S.

(Strong) Finite Witnessability. For finite sets of variables V_σ of sort σ for each σ ∈ S,
and equivalence relations E_σ on V_σ, the arrangement on V = ⋃_{σ∈S} V_σ induced by
E = ⋃_{σ∈S} E_σ, denoted by δ_V or δ_V^E, is the formula
δ_V = ⋀_{σ∈S} [⋀_{x E_σ y} (x = y) ∧ ⋀_{x Ē_σ y} ¬(x = y)],
where Ē_σ denotes the complement of the equivalence relation E_σ.
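
To make the definition of an arrangement concrete, here is a small sketch that builds the literals of δ_V from a partition of the variables of each sort. The function names `arrangement` and `arrangement_formula` are hypothetical; as a simplification, each unordered pair of distinct variables is listed once, which yields a formula equivalent to the one defined above.

```python
from itertools import combinations


def arrangement(classes):
    """Literals of the arrangement induced by a partition of variables of one sort.

    classes: list of lists of variable names; each inner list is one equivalence class.
    Returns (equalities, disequalities) as lists of strings.
    """
    owner = {v: i for i, cls in enumerate(classes) for v in cls}
    variables = sorted(owner)
    equalities, disequalities = [], []
    for x, y in combinations(variables, 2):
        if owner[x] == owner[y]:
            equalities.append(f"{x} = {y}")
        else:
            disequalities.append(f"¬({x} = {y})")
    return equalities, disequalities


def arrangement_formula(classes_by_sort):
    """Conjunction over all sorts, mirroring the outer conjunction over σ ∈ S."""
    literals = []
    for sort in sorted(classes_by_sort):
        eqs, diseqs = arrangement(classes_by_sort[sort])
        literals.extend(eqs + diseqs)
    return " ∧ ".join(literals)


print(arrangement_formula({"sigma": [["x", "y"], ["z"]], "sigma2": [["u"], ["w"]]}))
```

An arrangement thus fixes, for every pair of variables of the same sort, whether they denote the same element, which is what allows it to control cardinalities.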

𝒯 is finitely witnessable with respect to S when there exists a computable function
wit, called a witness, from the quantifier-free Σ-formulas to themselves that satisfies,
for every φ: (i) φ and ∃w⃗. wit(φ) are 𝒯-equivalent, for w⃗ = vars(wit(φ)) \ vars(φ);
and (ii) if wit(φ) is 𝒯-satisfiable, there exists a 𝒯-interpretation 𝒜 satisfying wit(φ)
such that σ^𝒜 = vars_σ(wit(φ))^𝒜 for each σ ∈ S.

Strong finite witnessability is defined similarly to finite witnessability, replacing (ii)
by: (ii)′ given a finite set of variables V and an arrangement δ_V on V, if wit(φ) ∧ δ_V
is 𝒯-satisfiable, there exists a 𝒯-interpretation 𝒜 that satisfies wit(φ) ∧ δ_V with
σ^𝒜 = vars_σ(wit(φ) ∧ δ_V)^𝒜 for all σ ∈ S. If 𝒯 is smooth and (strongly) finitely witnessable
with respect to S, then it is (strongly) polite with respect to S.


Finite Model Property and Stable Finiteness. 𝒯 has the finite model property with
respect to S if for every quantifier-free 𝒯-satisfiable Σ-formula, there exists a
𝒯-interpretation 𝒜 that satisfies it with |σ^𝒜| finite for each σ ∈ S. 𝒯 is stably finite with
respect to S if, for every quantifier-free Σ-formula and 𝒯-interpretation 𝒜 that satisfies
it, there exists a 𝒯-interpretation ℬ that satisfies it with: |σ^ℬ| finite for each σ ∈ S; and
|σ^ℬ| ≤ |σ^𝒜| for each σ ∈ S. Clearly, stable finiteness implies the finite model property:


Theorem 1. If 𝒯 is stably finite w.r.t. S, then it has the finite model property w.r.t. S.


We shall write SI for stably infinite; SM for smooth; FW (SW) for (strong) finitely 
witnessable; CV for convex; FM for the finite model property; and SF for stably finite. 


3 Relationships Between Model-Theoretic Properties 


In this section we study the connections between finiteness properties related to the- 
ory combination: the finite model property, stable finiteness, finite witnessability, and 
strong finite witnessability. We show how these properties are related to one another. In 
Sect. 3.1, we provide general results that hold for all signatures. Then, in Sect. 3.2, we 
focus on empty signatures, in which we are able to find more connections. 


3.1 General Signatures 


Finite witnessability, as well as its strong variant, were introduced in the context of 
polite theory combination. In contrast, the study of shiny theories utilizes the notions of 
the finite model property, as well as stable finiteness. It was shown in [1] that for theories
with a decidable quantifier-free satisfiability problem, shiny theories and strongly
polite theories are one and the same. This already showed some connections between 
the aforementioned finiteness properties. However, that analysis also relied on smooth- 
ness, the decidability of the quantifier-free satisfiability problem of the studied theories, 
as well as the computability of the mincard function, the function that computes the 
minimal sizes of domains in models of a given formula in these theories. 

Here we focus purely on the finiteness properties, and show that even without any 
other assumptions, they are closely related. Considering finite witnessability and the 
finite model property, notice that any witness ensures that some formulas always have 
finite models. Using the equivalence of the existential closure of such formulas to the 
formulas that are given to the witness, one gets the following result, according to which 
finite witnessability implies the finite model property. 


Theorem 2. Any Σ-theory 𝒯 finitely witnessable with respect to S ⊆ S_Σ also has the
finite model property with respect to S.


Strong finite witnessability is a stronger property than finite witnessability, obtained 
by requiring finite models in the presence of arrangements. This requirement allows one 
to conclude stable finiteness for it, as the finer control on cardinalities that is required 
for stable finiteness can be achieved with the aid of arrangements. The following result 
is proved in Lemma 3.6 of [1], although under the assumption that the theory is smooth, 
something that is not actually used in their proof. 


Theorem 3. Any Σ-theory 𝒯 strongly finitely witnessable with respect to S ⊆ S_Σ is
also stably finite with respect to S.


Clearly, stable finiteness implies the finite model property (Theorem 1). The con- 
verse does not generally hold, as we will see in Sect. 4. However, when these properties 
are considered with respect to a single sort, they actually coincide: 


Theorem 4. If a Σ-theory 𝒯 has the finite model property with respect to a set of sorts
S with |S| = 1, then 𝒯 is also stably finite with respect to S.


Theorems 2 and 3 are visualized in the Venn diagram of Fig. 2, where, for exam- 
ple, theories that are strongly finitely witnessable are clearly inside the intersection of 
finitely witnessable theories and stably finite theories. 
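
The implications collected so far already prune the space of combinations of the four finiteness properties. As a sketch, the following enumerates all truth assignments to FW, SW, FM and SF and keeps those consistent with Theorems 1 to 3 and with the fact that strong finite witnessability is stronger than finite witnessability. The encoding (`PROPS`, `IMPLICATIONS`, `feasible_combinations`) is illustrative only, and a surviving combination is merely not excluded by these implications; whether a theory realizing it exists is a separate question.

```python
from itertools import product

PROPS = ("FW", "SW", "FM", "SF")

# (a, b) means "a implies b".
IMPLICATIONS = [
    ("SF", "FM"),  # Theorem 1
    ("FW", "FM"),  # Theorem 2
    ("SW", "SF"),  # Theorem 3
    ("SW", "FW"),  # strong finite witnessability is the stronger property
]


def consistent(assignment):
    """True if no implication has a true premise and a false conclusion."""
    return all(not assignment[a] or assignment[b] for a, b in IMPLICATIONS)


def feasible_combinations():
    result = []
    for values in product((True, False), repeat=len(PROPS)):
        assignment = dict(zip(PROPS, values))
        if consistent(assignment):
            result.append(assignment)
    return result


for combo in feasible_combinations():
    print("".join("T" if combo[p] else "F" for p in PROPS))
```

Of the 16 assignments, 6 survive, matching the regions one can read off a Venn diagram in which SW lies inside the intersection of FW and SF, and both lie inside FM.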

When only one sort is considered, the picture is much simpler, and is described in 
Fig. 3. There, the finite model property and stable finiteness populate the same region, 
as ensured by Theorem 4. Notice that the results depicted in Fig. 3 hold for both
one-sorted and many-sorted signatures: what matters is that all the properties are
considered with respect to a single sort.


3.2 Empty Signatures 


Figures 2 and 3 show a complete picture of the relationships between the properties
studied in this section, for arbitrary signatures. However, when this generality is relaxed,
several other connections appear. For this section, we require that the signatures are
empty, and that they have a finite set of sorts. We further require that the properties in
question hold for the entire set of sorts, not for any subset of it.


Fig. 2. Finiteness properties: general case.

Fig. 3. Finiteness properties w.r.t. one sort.

Table 1 defines the 5 signatures that will be used in the examples found in Sect. 4,
and that will also appear in some of the results shown below: the empty signatures Σ_1,
Σ_2 and Σ_3, with sets of sorts {σ}, {σ, σ_2} and {σ, σ_2, σ_3}, respectively; and the
signatures Σ_s and Σ_s^2 with one function s of arity σ → σ, and sets of sorts {σ} and {σ, σ_2},
respectively. Notice these are the simplest possible signatures when we order those by
establishing: first, that the signature with fewer sorts is simpler; and second, that if two
signatures have the same number of sorts, the one with fewer function symbols is simpler.
We are free not to consider predicates, as they are at least as expressive as functions
themselves; furthermore, we do not consider the problem of defining which of two
signatures with the same numbers of sorts and function symbols is simpler, choosing rather
to add only functions from a sort to itself.


Table 1. Signatures that will be used throughout the paper.


Signature | Sorts          | Function Symbols
Σ_1       | {σ}            | ∅
Σ_2       | {σ, σ_2}       | ∅
Σ_3       | {σ, σ_2, σ_3}  | ∅
Σ_s       | {σ}            | {s : σ → σ}
Σ_s^2     | {σ, σ_2}       | {s : σ → σ}


First, in such a setting, we have that the finite model property implies finite witness- 
ability, in the presence of smoothness. 


Theorem 5. If Σ is an empty signature with a finite set of sorts S_Σ, and the Σ-theory
𝒯 has the finite model property and is smooth with respect to S_Σ, then 𝒯 is also finitely
witnessable with respect to S_Σ.


Next, we show that stable finiteness and smoothness together imply strong finite
witnessability.


Theorem 6. If Σ is an empty signature with a finite set of sorts S_Σ, and the Σ-theory 𝒯
is stably finite and smooth with respect to S_Σ, then 𝒯 is also strongly finitely witnessable
with respect to S_Σ.


Fig. 4. Interplay between SM, FW (SW) and FM (SF) w.r.t. S_Σ in an empty signature.


While Theorem 2 and Theorem 3 establish certain unconditional relations between
finite witnessability and the finite model property, and strong finite witnessability and
stable finiteness, the converses shown to hold in Theorem 5 and Theorem 6 demand
smoothness and that the properties hold with respect to the entire set of sorts. In that
case, the situation can be represented by the diagram found in Fig. 4, showing clearly
that a smooth theory that also has the finite model property (respectively, is stably
finite) must also be finitely witnessable (respectively, strongly finitely witnessable).

Lastly, regarding the empty signatures Σ_1, Σ_2 and Σ_3, the following theorem shows
that Σ_3 is sometimes necessary.


Theorem 7. There are no Σ_1- or Σ_2-theories 𝒯 that are, simultaneously, neither stably
infinite nor stably finite, but are convex and have the finite model property, with respect
to the entire set of their sorts.


Hence, to exhibit such theories, one has to consider three-sorted theories. 


4 A Taxonomy of Examples 


In [10], we created a table in which, for every possible combination of properties
from {SI, SM, FW, SW, CV}, we either gave an example of a theory with this combination,
or proved a theorem showing that there is no such example, with the exception of
theories that are stably infinite and strongly finitely witnessable but not smooth. Such
theories, referred to in [10] as Unicorn Theories (due to our conjecture that they do not
exist), were left for future work, and remain so, as the focus of the
current paper is the integration of the finiteness properties FM and SF into the table.

And indeed, the goal of this section is to add two columns to the table from [10]: 
one for the finite model property and one for stable finiteness. The extended table is 
Table 2. We do not assume familiarity with [10], and describe the entire resulting table 
(though focusing on the new results). 


166 G. V. Toledo et al. 


Table 2. Summary of all possible combinations of theory properties. Red cells represent
impossible combinations. In lines 26 and 34, n > 1; in lines 29, 30 and 35, m > 1, n > 1 and
|m − n| > 1.


Empty Non-empty 
SI |SM | FW SW CV)|FM| SF || One-sorted | Many-sorted | One-sorted | Many-sorted | N° 
ge ETF Ton (Ton)? (Tan)s (Ten) )s | 1 
F|T|T [10] (Zen)v | (Fen)")v_| 2 
T T T Theorem 6 Ty (Ty) s 3 
F F Theorem 4 T2,3 Theorem 4 (T2,3)s 4 
T Tf Tf)? 5 
F | T = [10] f (TF) 
T F Theorem 4 (Ta,3)v 6 
T Te T= 7 
T Theorem 5 > = 
T F Theorem 4 T 8 
2 2 
ele Fh = (Tx) Co) | P | 9 
AE T aP Ju 
F F [10] Theorem 4 TN 11 
T F|? (Zao)v ((Too)*)v_| 12 
TIEF 13 
T Unicorn i—— 
FEP i! 14 
i | 2 | Te (TS)? | Ca) | E | 15 
F F Theorem 4 T? Theorem 4 (T~) 16 
T = Tea) 17 
re xe (Tein) | ORN 
F F Theorem 4 (T®)v 18 
PE T (T)? (T), (TPs |19 
T F Theorem 4 T” Theorem 4 (T° Js 20 
rae F|F Tn,oo (Ta)? (Injoo)s | ((Zn,co)”)s | 21 
ele (Tov (TP |2 
F F [10] Theorem 4 (T )v 23 
F| F (Tno)v | (Ta) )v | 24 
T A T T Tzi (T21) (T<1)s ((T<1)?)s 25 
FJTJ|T Tzn (Tzn)? (Tzn)s ((Zen)*)s_| 26 
T a T [10] God Te, (Be); 27 
F F Theorem 4 TÈ 3 Theorem 4 TE 28 
ror E| Tom | imm)? | imm) | (Tnn), |29 
F F F Theorem 4 Th Theorem 4 (Tne 30 
Pa: Tf TŽ Gy |31 
T F TRR Theorem 4 TÈ 32 
F| F T° Ty. T°). | 33 
FF [10] 1 Lise (Te) 
T T; T: (Ti): |34 
F F Tin Theorem 4 (Fine 35 
F|F Ts Ti (T°). |36 


This section is structured as follows: In Sect. 4.1 we describe the structure of
Table 2. In Sects. 4.2 to 4.4 we provide details about the axiomatizations of theories 
that populate it. Finally, in Sect. 4.5, we reuse operators from [10], prove that they pre- 
serve the finite model property and stable finiteness, and show how they are used in 
order to generate more theories for Table 2. 
4.1 The Table


The columns to the left of the vertical double-line of Table 2 correspond to possible
combinations of properties. In them, T means that the property holds, while F means that
it does not. The first 5 columns correspond to properties already studied in [10], and
the next two columns correspond to FM and SF. The columns to the right of the vertical
double-line correspond to possible signatures: empty or non-empty, and one-sorted or
many-sorted. White cells correspond to cases where a theory with the combination of
properties induced by the row exists in a signature that is induced by the column. In such
a case, the name of the theory is written. The theories themselves are defined
axiomatically in Figs. 5, 7 and 8. Shaded cells correspond to the cases where there is no
such theory. In such a case, the theorem that excludes this possibility is written. If that
theorem is from [10], we simply write [10].


Example 1. Line 1 of Table 2 corresponds to theories that admit all studied properties.
We see that there is such a theory in each of the studied types of signatures (e.g., for
the empty one-sorted signature, the theory 𝒯_{≥n} exhibits all properties). In contrast, line
3 corresponds to theories that admit all properties but strong finite witnessability. We
see that such theories exist in non-empty signatures, but not in empty signatures. This
is thanks to Theorem 6.


Section 3, as well as results from [10], make some potential rows of Table 2 com- 
pletely shaded. To allow this table to fit a single page, we chose to erase such rows. 
For example, by Theorem 1, there are no theories that are stably finite but do not have 
the finite model property, in any signature. Thus, no rows that represent such theories 
appear in the table. 

In the remainder of this section, we describe the various theories that populate the 
cells of the table. Fortunately, all theories from [10] can be reused to exhibit also the new 
properties SF and FM, or their negations. These are described in Sect. 4.2. However, 
the theories from [10] alone are not enough. Hence we introduce several new theories 
in Sects. 4.3 and 4.4. Some of them are relatively simple, and are described in Sect. 4.3. 
Most of them, however, are more complex, and rely on the Busy Beaver function from 
theoretical computer science. We discuss these theories in Sect. 4.4. 


4.2 Theories from [10] 


For completeness, we include in Fig. 5 the axiomatizations of all theories from [10]
that are used in Table 2 (Fig. 6 includes the definitions of formulas that are abbreviated
in Fig. 5, such as ψ^=_{≥n} from the definition of 𝒯_f). For lack of space, however,
we refrain from elaborating on these theories, and refer the reader to their detailed
description in [10]. For the theories of Fig. 5, whether they admit the properties from
{SI, SM, FW, SW, CV} or not was already established in [10]. For each of them, here,
we also check and prove whether they admit the new properties FM and SF.
Name | Sig. | Axiomatization
Ton | X41 {wsn} Name|Sig. Axiomatization 
Toven | X1 | {rpa2nd1: k € N} T23 | D2 {(VZ2 A YSF) V (WS3 A YSR) : k €N} 
Too | Æi {w>n ik EN} Tx? | X2 {V22} U {VS} : k € N} 
Taco | Ei {pan V Von BEN} TP" E2] {Ph} UL, BEN} 
Ten | X1 {v<n} TP | Zo {Yl} U {Y3 : k EN} 
Tmn| 21| {pam V Y=n} 


Name) Sig. Axiomatization 
Ty | 2s nw AVE nl Y VicalPEn a A YZ yo] | EN \ {OF} 
Tp | Xs Ax(Ty) U {wv} 

Tř | Xs {[p=a2 A Y z. p(£)] V [Poe A Yz. =p(x)] : k € N} 

Tla | Zs {bar V Pyar AV x. ap(a)] : k €N} 

TE oo | Xs {war V [Won A Yz. ap(x)]: k € N} 


Fig. 5. Theories for Table 2 that were studied in [10]; p(x) stands for s(x) = x. In 𝒯_f, f is any
non-computable function from the positive integers to {0, 1}, such that for every k > 0, f maps
half of the numbers between 1 and 2^k to 1, and the other half to 0. In [10], such a function was
proven to exist.


ψ^=_{≥n} = ∃x̄. [⋀_{i=1}^n p(x_i) ∧ δ_n]
ψ^=_{=n} = ∃x̄. [⋀_{i=1}^n p(x_i) ∧ δ_n ∧ ∀x. [p(x) → ⋁_{i=1}^n x = x_i]]
ψ^≠_{≥n} = ∃x̄. [⋀_{i=1}^n ¬p(x_i) ∧ δ_n]
ψ^≠_{=n} = ∃x̄. [⋀_{i=1}^n ¬p(x_i) ∧ δ_n ∧ ∀x. [¬p(x) → ⋁_{i=1}^n x = x_i]]
ψ_∨ = ∀x. [(s(s(x)) = x) ∨ (s(s(x)) = s(x))]

Fig. 6. Formulas for Σ_s-theories. x̄ stands for x_1, …, x_n, δ_n stands for
⋀_{1≤i<j≤n} ¬(x_i = x_j), and p(x) stands for s(x) = x.


For example, for each n, 𝒯_{≥n} consists of all Σ_1-structures that have at least n
elements. This theory was shown in [10] to be strongly finitely witnessable, and so by
Theorem 3 it is also stably finite. Then, by Theorem 1, it also admits the finite model
property.
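
Stable finiteness of 𝒯_{≥n} can also be seen directly, which the following sketch illustrates; it is not the argument used in the paper, which goes through Theorem 3. Over the empty one-sorted signature a quantifier-free formula only constrains which variables are equal, so given an interpretation that assigns k distinct values to the variables, a domain of size max(n, k) suffices and is never larger than the original domain. The function name `shrink` and the dictionary representation of assignments are hypothetical.

```python
def shrink(n, assignment):
    """Return (domain_size, new_assignment) for a finite model with at least n elements.

    assignment maps variable names to elements of some (possibly huge) domain.
    The new assignment uses elements 0, 1, ... and preserves exactly the same
    equalities and disequalities between variables.
    """
    renaming = {}
    for value in assignment.values():
        # Give each distinct original value the next unused small number.
        renaming.setdefault(value, len(renaming))
    new_assignment = {var: renaming[val] for var, val in assignment.items()}
    domain_size = max(n, len(renaming))
    return domain_size, new_assignment


size, small = shrink(3, {"x": 10**6, "y": 7, "z": 10**6})
print(size, small)
```

The original domain has at least n elements (it is a model of 𝒯_{≥n}) and at least k elements (it hosts k distinct values), so the bound |σ^ℬ| ≤ |σ^𝒜| required by stable finiteness holds.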

It is worth mentioning that 𝒯_{2,3} was first introduced in [1], in the context of shiny
theories, where it was shown to have the finite model property, while not being stably
finite. An alternative proof of this fact goes as follows: it was proven in [8] that 𝒯_{2,3}
is: (i) finitely witnessable; (ii) not strongly finitely witnessable; and (iii) smooth. By




Name] Signature Axiomatization 

Te) 35 {(b21 A WS.) V diag”? (k + 2) : k € N} 

Tmn| 22 (Ge maxima V (Comintern ^ Yo) : k EN} 

TZ X2 {YZ A Si.) V (diag? (k +2) AV x. =p(x) : k € N} 
Ts | Ds | (BP LL AVR) V WSs AWS): k EN} 


Fig. 7. Simple theories for Table 2. diag^2(k + 2), for any k ∈ ℕ, stands for the formula
(ψ^σ_{≥k+2} ∧ ψ^{σ_2}_{≥k+2}) ∨ ⋁_{i=2}^{k+2} (ψ^σ_{=i} ∧ ψ^{σ_2}_{=i}), and p(x) stands for s(x) = x.


Theorem 2 and (i), it also has the finite model property. But since it is over an empty 
signature, by (ii), (iii) and Theorem 6, we have that it cannot be stably finite. 


4.3 New Theories: The Simple Cases 


While the theories from Fig. 5 suffice to populate many cells of Table 2, they are not 
enough. Hence we describe new theories, not taken from [10]. The simplest theories 
that we have added can be found in Fig. 7, and are described below. 

The first theory of Fig. 7 has three distinct groups of models: its first group consists
of models 𝒜 that have |σ^𝒜| = 1 and σ_2^𝒜 infinite; its second group, of models 𝒜 where both
σ^𝒜 and σ_2^𝒜 are infinite; and its third group, of models 𝒜 where |σ^𝒜| = |σ_2^𝒜| is any
value k ≥ 2. In its axiomatization, one finds the formula diag^2(k + 2), equal to
(ψ^σ_{≥k+2} ∧ ψ^{σ_2}_{≥k+2}) ∨ ⋁_{i=2}^{k+2} (ψ^σ_{=i} ∧ ψ^{σ_2}_{=i}) for k ∈ ℕ: that formula
characterizes the models 𝒜 of the theory that lie in the diagonal, that is, where
|σ^𝒜| = |σ_2^𝒜| (and this value is greater than 1), or both are infinite.

The second theory of Fig. 7 depends on two distinct positive integers m and n. Supposing,
without loss of generality, that m > n, the theory has two types of models 𝒜:
in the first, |σ^𝒜| equals m, while σ_2^𝒜 can be anything; in the second, |σ^𝒜| equals n, and
then σ_2^𝒜 must be infinite.

The models 𝒜 of the third theory, over the signature Σ_s^2, have either: |σ^𝒜| = 1,
|σ_2^𝒜| ≥ ω and s^𝒜 the identity function; both σ^𝒜 and σ_2^𝒜 infinite, and s^𝒜 with no fixed
points; or |σ^𝒜| = |σ_2^𝒜| equal to any number in ℕ \ {0, 1}, and again s^𝒜 with no fixed points.

Finally, the fourth theory, over Σ_3, is made up of just the models 𝒜 of 𝒯_{2,3} (see Fig. 5)
with an extra domain associated to the new sort σ_3 such that |σ_3^𝒜| = 1.


4.4 New Theories: The Busy Beaver 


So far we have seen that the theories from [10], together with a small set of simple 
new theories, can already get us quite far in filling Table 2. However, for several com- 
Name | Signature | Axiomatization
T | 2% {h><ce4a) V Vito Vac 1B €N} 
Te X2 {(b21 A WSj,) V diag??? (k +2) : k € N} 
Ta 22 {Pen} U Ya) V vV vZ Vaik € N} 
Tim | X (UR A VSR) V (hin A PE aay) V Vice (bm AY) ik EN} 
airs {Wenn AYS) VVE Oa AVE ag) REN 
TP | Le [{(ba2 AVY z. (2) V ((Wac(e+2) V Vitz Pac) AY z. p(2)) : k € N} 
T| Zs {a1 V ((b>ccn42) V Vito Pact) A Yz. =p(x)) : k € N} 
TY | Ee | Cv} UL (beet nYa) V VE (bai A YE ay) k EN} 
TO Pp {WSK42 > Yie) | :k €N} 
TÈ xy {W A VS) V (Wr > Vkit) 1 k EN} 
T| 32 {ov} U LU AVS) V HEr > Van) ik EN} 
h E {(W2Z1 A YZ,) V (diag??? (k +2) A Yz. =p(x)) : k € N} 
TOP] Ds {Vi} U {02 AYS) V diag??? (k +2) : k € N} 


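Fig. 8. Busy Beaver theories for Table 2. diag???$(k+2)$ stands, for each $k \in \mathbb{N}$, for
$(\psi^{\sigma}_{\geq \varsigma(k+2)} \wedge \psi^{\sigma_2}_{\geq \varsigma(k+2)}) \vee \bigvee_{i=2}^{k+2} (\psi^{\sigma}_{=\varsigma(i)} \wedge \psi^{\sigma_2}_{=\varsigma(i)})$,
and $p(x)$ for $s(x) = x$; in Tf, n, we assume w.l.o.g. $m > n$.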


binations, it seems that more complex theories are needed. For this purpose, we utilize 
the well-known Busy Beaver function, and define various theories based on it. In this 
section, we describe these theories. First, in Sect. 4.4.1, we review the Busy Beaver 
function, and explain why it is useful in our context. Then, in Sects. 4.4.2 to 4.4.6, we 
describe the theories that make use of it, separated according to their signatures. 


4.4.1 On the Busy Beaver Function

The Busy Beaver function, here denoted $\varsigma$, is an old acquaintance of theoretical
computer scientists: essentially, given any $n \in \mathbb{N}$, $\varsigma(n)$ is the maximum number of
1's a Turing machine with at most $n$ states can write to its tape when it halts, if the
tape is initialized to be all 0's. Somewhat confusingly, any Turing machine that
achieves that number is also called a Busy Beaver.

It is possible to prove that $\varsigma(n) \in \mathbb{N}$ for any $n \in \mathbb{N}$ (see [5]), and so we may write
$\varsigma : \mathbb{N} \to \mathbb{N}$; furthermore, $\varsigma$ is increasing. But the very desirable property of $\varsigma$ is
that it is not only increasing, but actually very rapidly increasing.

More formally, Radó proved, in the seminal paper [5], that $\varsigma$ grows asymptotically
faster than any computable function (being, therefore, non-computable). That is, for
every computable function $f : \mathbb{N} \to \mathbb{N}$, there exists $N \in \mathbb{N}$ such that $\varsigma(n) > f(n)$ for
all $n > N$. Despite that, the Busy Beaver starts somewhat slowly: $\varsigma(0) = 0$, $\varsigma(1) = 1$,
$\varsigma(2) = 4$, $\varsigma(3) = 6$ and $\varsigma(4) = 13$; the exact value of $\varsigma(5)$ (and actually $\varsigma(n)$ for any
$n > 5$) is not known, but is at least 4098 [3].
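The small values above can be checked by direct simulation. The following is a minimal sketch, not taken from the paper: it runs the standard 2-state, 2-symbol Busy Beaver champion (the transition table `BB2` is the commonly cited one and is an assumption here, as are all names in the code) on an all-0 tape and counts the 1's left when it halts.

```python
from collections import defaultdict

# Assumed transition table of the commonly cited 2-state Busy Beaver champion.
# (state, symbol read) -> (symbol to write, head move, next state); "H" halts.
BB2 = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "H"),
}

def run(machine, max_steps=10_000):
    """Run `machine` on an all-0 tape; return (number of 1's, steps) once it halts."""
    tape = defaultdict(int)  # unbounded tape, every cell initially 0
    pos, state, steps = 0, "A", 0
    while state != "H":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted; the machine may not halt")
        write, move, state = machine[(state, tape[pos])]
        tape[pos] = write
        pos += move
        steps += 1
    return sum(tape.values()), steps

ones, steps = run(BB2)
print(ones, steps)  # 4 6
```

The machine halts after 6 steps with 4 ones on the tape, matching the value 4 quoted for two states. The step budget is needed because no general procedure decides whether an arbitrary machine halts, which is the reason the function is not computable.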
The fact that $\varsigma$ grows eventually faster than any computable function is a great
property to have when constructing theories that admit the finite model property, while
not being finitely witnessable. Roughly speaking, if the cardinalities of models of a
theory are related to $\varsigma$, this guarantees that it has models of sufficiently large finite
size, while not being finitely witnessable since its models grow too fast: by carefully
choosing formulas $\phi_n$ that hold only in the "$n$-th model" of the theory (when ordered by
cardinality), the number of variables of $wit(\phi_n)$ offers an upper bound to $\varsigma(n)$ and is
therefore not computable, leading to a contradiction with the fact that $wit$ is supposed
to be computable. Notice that, despite the dependency of our theories on the Busy Beaver,
the function is not actually part of their signatures.

Now we present the theories that are based on $\varsigma$. These theories are axiomatized in
Fig. 8.


4.4.2 A $\Sigma_1$-Theory

The most basic Busy Beaver theory is $\mathcal{T}_{\varsigma}$. This is the $\Sigma_1$-theory whose models
have cardinality $\varsigma(k)$, for some $k \geq 2$, or are infinite: that is, $\mathcal{T}_{\varsigma}$ has models with
4 elements, 6, 13 and so on. This theory forms the basis for all other theories of this
section, which are designed to admit various properties from Table 2.

By itself, $\mathcal{T}_{\varsigma}$ has the finite model property while not being (strongly) finitely
witnessable. It was in fact constructed precisely to exhibit this. As it turns out, it is
also not smooth, but does satisfy all other properties. To populate other rows in the
table that correspond to theories with other combinations of properties, more theories
are needed, with richer signatures.
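As a small illustration of how sparse these finite models are, the sketch below lists the finite model sizes that can be determined from the values quoted in Sect. 4.4.1 alone. It assumes only those values and the fact that the Busy Beaver function is increasing; the names `KNOWN`, `SIGMA5_LOWER_BOUND` and `is_finite_model_size` are hypothetical.

```python
# Values of the Busy Beaver function quoted in the text, for arguments 0..4.
KNOWN = [0, 1, 4, 6, 13]
# Lower bound quoted in the text for argument 5; since the function is
# increasing, every later value is at least this large as well.
SIGMA5_LOWER_BOUND = 4098

def is_finite_model_size(n):
    """Decide whether a finite n is the size of a model, when the quoted values suffice."""
    if n >= SIGMA5_LOWER_BOUND:
        raise ValueError("undetermined from the values quoted in the text")
    return n in KNOWN[2:]  # values at arguments k >= 2

print([n for n in range(1, SIGMA5_LOWER_BOUND) if is_finite_model_size(n)])  # [4, 6, 13]
```

Below 4098 the only finite model sizes are 4, 6 and 13, which gives a concrete sense of why no computable witness function can keep up with the sizes of the models.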


4.4.3 $\Sigma_2$-Theories

To fill the rows that correspond to other combinations, we introduce several
$\Sigma_2$-theories.

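The $\Sigma_2$-theory 7°° is more complex. It has, essentially, three classes of models:
the first is made up of structures $\mathcal{A}$ where $|\sigma^{\mathcal{A}}| = 1$ and $\sigma_2^{\mathcal{A}}$ is infinite; the
second, of structures where both $\sigma^{\mathcal{A}}$ and $\sigma_2^{\mathcal{A}}$ are infinite; and the third, of
structures where $|\sigma^{\mathcal{A}}| = |\sigma_2^{\mathcal{A}}|$ is a finite value that equals $\varsigma(k)$, for some $k \geq 2$.
The formula diag???$(k+2)$, for $k \geq 2$, in the axiomatization equals
$(\psi^{\sigma}_{\geq \varsigma(k+2)} \wedge \psi^{\sigma_2}_{\geq \varsigma(k+2)}) \vee \bigvee_{i=2}^{k+2} (\psi^{\sigma}_{=\varsigma(i)} \wedge \psi^{\sigma_2}_{=\varsigma(i)})$
and is similar to diag”?$(k+2)$ from TS; it characterizes the models $\mathcal{A}$ where either
$|\sigma^{\mathcal{A}}| = |\sigma_2^{\mathcal{A}}| = \varsigma(k+2)$, or both $\sigma^{\mathcal{A}}$ and $\sigma_2^{\mathcal{A}}$ are infinite.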

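For each $n > 0$, 7,5 has as interpretations those $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| = n$, and $|\sigma_2^{\mathcal{A}}|$ either
infinite or equal to $\varsigma(k)$, for some $k \geq 2$ (so $(|\sigma^{\mathcal{A}}|, |\sigma_2^{\mathcal{A}}|)$ may equal $(n, 4)$, $(n, 6)$,
$(n, 13)$ and so on).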

Thn is a $\Sigma_2$-theory that can be seen as some sort of combination of 7,7, J, as
it is dependent on two distinct positive integers $m$ and $n$. Consider the case where the
former is the greater of the two (the other cases are similar). In this case, we may
divide its interpretations $\mathcal{A}$ into three classes: those with $|\sigma^{\mathcal{A}}| = n$ and $\sigma_2^{\mathcal{A}}$
infinite; those with $|\sigma^{\mathcal{A}}| = m$ and $\sigma_2^{\mathcal{A}}$ infinite; and those with $|\sigma^{\mathcal{A}}| = m$ and
$|\sigma_2^{\mathcal{A}}|$ equal to some $\varsigma(k)$, for $k \geq 2$.


4.4.4 $\Sigma_s$-Theories

For some lines of Table 2, e.g. line 7, empty signatures are not enough for
presenting examples. Hence we also introduce $\Sigma_s$-theories.

We start with Z£, which is, arguably, the most confusing theory we here define:
we are forced to appeal not only to the special cardinality formulas found in Fig. 6, but
also to the function $\varsigma^{-1}$, which is a left inverse of $\varsigma$. More formally,
$\varsigma^{-1} : \mathbb{N} \to \mathbb{N}$ is the only function such that
$\varsigma^{-1}(k) = \min\{l : \varsigma(l+1) > k\}$: so $\varsigma^{-1}(0) = 0$,
$\varsigma^{-1}(1) = \varsigma^{-1}(2) = \varsigma^{-1}(3) = 1$, $\varsigma^{-1}(4) = \varsigma^{-1}(5) = 2$,
$\varsigma^{-1}(6) = \cdots = \varsigma^{-1}(12) = 3$, $\varsigma^{-1}(13) = \cdots = \varsigma^{-1}(4097) = 4$,
and further values of $\varsigma^{-1}$ are currently unknown.
From the definition of $\varsigma^{-1}$, we have that $\varsigma(\varsigma^{-1}(k)) \leq k$ and $\varsigma^{-1}(\varsigma(k)) = k$.
$\varsigma^{-1}$ is not computable given that, since $\varsigma^{-1}(k) = \min\{l : \varsigma(l+1) > k\}$ by
definition, $\varsigma^{-1}(k+1) \neq \varsigma^{-1}(k)$ iff $k+1$ is a value of $\varsigma$: so, an algorithm to
compute the values of $\varsigma$ could be obtained by simply computing the values of $\varsigma^{-1}$ and
checking where there is a change.
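The listed values of the left inverse, and the two identities that follow them, can be recomputed from the values of the Busy Beaver function quoted in Sect. 4.4.1. The sketch below is only an illustration under that assumption: it covers the arguments up to 4097, where the lower bound 4098 on the value at 5 settles the minimum, and the names `KNOWN`, `SIGMA5_LOWER_BOUND` and `sigma_inv` are hypothetical.

```python
KNOWN = [0, 1, 4, 6, 13]     # Busy Beaver values at 0..4, as quoted in the text
SIGMA5_LOWER_BOUND = 4098    # the value at 5 is at least 4098, as quoted in the text

def sigma_inv(k):
    """min{ l : sigma(l + 1) > k }, for those k that the quoted values decide."""
    for l in range(len(KNOWN) - 1):
        if KNOWN[l + 1] > k:
            return l
    if k < SIGMA5_LOWER_BOUND:   # then the value at 5 exceeds k, so the minimum is 4
        return len(KNOWN) - 1
    raise ValueError("needs Busy Beaver values that are not known")

# Arguments at which the left inverse changes are exactly the values of the function.
jumps = [k for k in range(1, SIGMA5_LOWER_BOUND) if sigma_inv(k) != sigma_inv(k - 1)]
print(jumps)  # [1, 4, 6, 13]
```

The list of jumps reproduces the values 1, 4, 6, 13, which is the observation behind the non-computability argument: an algorithm for the left inverse would yield one for the Busy Beaver function itself.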

TE is then the $\Sigma_s$-theory with models $\mathcal{A}$ with any cardinality $k + 1 \geq 1$, such that
$s^{\mathcal{A}}(a) = a$ holds for precisely $\varsigma^{-1}(k+1)$ elements of $\mathcal{A}$, and so $s^{\mathcal{A}}(a) \neq a$ holds
for $k + 1 - \varsigma^{-1}(k+1)$ elements, the function $k \mapsto k + 1 - \varsigma^{-1}(k+1)$ being itself
non-decreasing, given that $\varsigma^{-1}(k+1)$ can equal either $\varsigma^{-1}(k)$ or $\varsigma^{-1}(k) + 1$.


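Example 2. We mention some T.3-structures as examples: a structure $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| = 1$
and $s^{\mathcal{A}}$ the identity; a structure $\mathcal{B}$ with $|\sigma^{\mathcal{B}}| = 2$ and $s^{\mathcal{B}}$ a constant function; a
structure $\mathcal{C}$ with $|\sigma^{\mathcal{C}}| = 3$ (say $\sigma^{\mathcal{C}} = \{a, b, c\}$) and $s^{\mathcal{C}}$ the identity for only one of
these elements (e.g., $s^{\mathcal{C}}$ can be a constant function, but now there are further
possibilities such as $s^{\mathcal{C}}(a) = s^{\mathcal{C}}(b) = a$ and $s^{\mathcal{C}}(c) = b$); and a structure $\mathcal{D}$ with
$|\sigma^{\mathcal{D}}| = 4$ (say $\sigma^{\mathcal{D}} = \{a, b, c, d\}$) and $s^{\mathcal{D}}$ the identity for only two of these elements
(e.g., $s^{\mathcal{D}}(a) = s^{\mathcal{D}}(b) = s^{\mathcal{D}}(c) = a$ and $s^{\mathcal{D}}(d) = d$).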
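The four structures of Example 2 can be checked mechanically against the condition that the number of fixed points of the function equals the left inverse of the Busy Beaver function at the size of the domain. The sketch below assumes the values of the left inverse listed earlier in this section; representing each function as a Python dictionary, and the names `SIGMA_INV`, `fixed_points` and `is_model`, are choices made here for illustration.

```python
# Left inverse of the Busy Beaver function for domain sizes 1..12, as listed in the text.
SIGMA_INV = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, **{n: 3 for n in range(6, 13)}}

def fixed_points(s):
    """Elements a of the domain with s(a) = a."""
    return {a for a, b in s.items() if a == b}

def is_model(s):
    """Check the fixed-point condition for a function given as a dict on its domain."""
    domain = set(s)
    assert set(s.values()) <= domain, "s must map the domain into itself"
    return len(fixed_points(s)) == SIGMA_INV[len(domain)]

A = {"a": "a"}                                  # identity on one element
B = {"a": "a", "b": "a"}                        # constant function
C = {"a": "a", "b": "a", "c": "b"}              # exactly one fixed point
D = {"a": "a", "b": "a", "c": "a", "d": "d"}    # exactly two fixed points

print([is_model(s) for s in (A, B, C, D)])                 # [True, True, True, True]
print(is_model({"a": "a", "b": "b", "c": "c"}))            # False: identity on three elements
```

The identity on a three-element domain fails because it has three fixed points where only one is allowed.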


Next, we continue to describe other $\Sigma_s$-theories.

TŽ has essentially two classes of models $\mathcal{A}$: those with $|\sigma^{\mathcal{A}}| = 2$ and $s^{\mathcal{A}}$ never
the identity; and those with $|\sigma^{\mathcal{A}}|$ equal to $\varsigma(k)$ or infinite, for some $k \geq 2$, and $s^{\mathcal{A}}$
the identity.

ae is very similar to TŽ: the difference lies in where $s$ will be the identity: while
in J the function $s$ is the identity for all interpretations $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| > 2$, $s$ in TA
is the identity only for the interpretations $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| = 1$. So, in T , we have a
model $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| = 1$ and $s^{\mathcal{A}}$ the identity, and then models $\mathcal{A}$ with $|\sigma^{\mathcal{A}}| = \varsigma(k)$ for
some $k \geq 2$ or infinite, and $s^{\mathcal{A}}(a)$ anything but $a$.

The $\Sigma_s$-theory J.” is then just 7, satisfying in addition the formula $\psi_{\vee}$ (see
Fig. 6). It has models $\mathcal{A}$ of any finite cardinality $k + 1$, as long as $\varsigma^{-1}(k+1)$ of these
elements $a$ satisfy $s^{\mathcal{A}}(a) = a$, or infinite cardinalities, as long as the number of elements
$a$ satisfying $s^{\mathcal{A}}(a) = a$ is infinite; additionally, $s^{\mathcal{A}}(s^{\mathcal{A}}(a))$ must always equal either
$s^{\mathcal{A}}(a)$ or $a$ itself.


4.4.5 $\Sigma_s^2$-Theories

Now for theories in a many-sorted non-empty signature.

The $\Sigma_s^2$-theory T= appears simple, but is actually quite tricky: starting with the
easy case, if $\sigma^{\mathcal{A}}$ has infinitely many elements $a$ satisfying $s^{\mathcal{A}}(a) = a$, $\sigma_2^{\mathcal{A}}$ is also
infinite. If, however, the number of elements $a \in \sigma^{\mathcal{A}}$ satisfying $s^{\mathcal{A}}(a) = a$ is finite
(notice that, even if this is the case, $\sigma^{\mathcal{A}}$ may still be infinite) and equal to some
$k + 2$, then $\sigma_2^{\mathcal{A}}$ has at least $\varsigma(k+2)$ elements. So, to give a better example, suppose
$\sigma^{\mathcal{A}}$ has 2 elements satisfying $s^{\mathcal{A}}(a) = a$: then $\sigma_2^{\mathcal{A}}$ has at least $\varsigma(2) = 4$ elements,
but may have any cardinality up to, and including, infinite ones; notice that in this
example $\sigma^{\mathcal{A}}$ may be infinite as well, as long as only two of the elements satisfy
$s^{\mathcal{A}}(a) = a$.
T is the same as T=, but with extra models $\mathcal{A}$ where $|\sigma^{\mathcal{A}}| = 1$ and $|\sigma_2^{\mathcal{A}}| \geq \omega$ (of
course, then we have that $s^{\mathcal{A}}$ is the identity).

Ty is then the same as ae with the added validity of the formula $\psi_{\vee}$; so the models
of TẸ are just models of 7? satisfying that $s^{\mathcal{A}}(s^{\mathcal{A}}(a))$ equals either $s^{\mathcal{A}}(a)$ or $a$ itself.

T is just the $\Sigma_2$-theory 7°° with the added function $s$ such that, if $|\sigma^{\mathcal{A}}| = 1$, $s^{\mathcal{A}}$
is the identity; and if $|\sigma^{\mathcal{A}}| > 1$, $s^{\mathcal{A}}(a)$ is anything but $a$.


4.4.6 A $\Sigma_3$-Theory

Finally, 7° is obtained by adding a sort with a single element to the $\Sigma_2$-theory
7°°, similarly to the definition of T;}3, which was based on the $\Sigma_2$-theory $\mathcal{T}_{2,3}$ (see
Sect. 4.3).


4.5 Theory Operators 


There are two types of theories in Table 2. The first consists of base theories, such as
$\mathcal{T}_{\geq n}$, that are axiomatized in Figs. 5, 7 and 8. The second is obtained from the first
by applying several operators on theories. For example, the theories $(\mathcal{T}_{\geq n})^2$,
$(\mathcal{T}_{\geq n})_s$, and $((\mathcal{T}_{\geq n})^2)_s$ are all obtained from the base theory $\mathcal{T}_{\geq n}$. So far we have
only described the theories of the first type. In this section we explain the theories of
the second type.

The operators that are used in Table 2 were defined in [10], in order to be able
to systematically generate examples in various signatures. For example, if $\mathcal{T}$ is a
$\Sigma_1$-theory, then $(\mathcal{T})^2$ is a $\Sigma_2$-theory with the same axiomatization as $\mathcal{T}$, that is,
the second sort is completely free and is not axiomatized in any way. For completeness'
sake, we include the definitions of these operators here:


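Definition 1 (Theory Operators from [10])

1. If $\mathcal{T}$ is a $\Sigma_1$-theory, then $(\mathcal{T})^2$ is the $\Sigma_2$-theory axiomatized by $Ax(\mathcal{T})$.

2. Let $\Sigma_n$ be an empty signature with sorts $S = \{\sigma_1, \ldots, \sigma_n\}$, and let $\mathcal{T}$ be a
$\Sigma_n$-theory. The signature $\Sigma_s^n$ has sorts $S$ and a single unary function symbol $s$ of
arity $\sigma_1 \to \sigma_1$, and $(\mathcal{T})_s$ is the $\Sigma_s^n$-theory axiomatized by
$Ax(\mathcal{T}) \cup \{\forall x.\, [s(x) = x]\}$, where $x$ is a variable of sort $\sigma_1$.

3. Let $\mathcal{T}$ be a theory over an empty signature with sorts $S = \{\sigma_1, \ldots, \sigma_n\}$. Then
$(\mathcal{T})_{\vee}$ is the $\Sigma_s^n$-theory axiomatized by $Ax(\mathcal{T}) \cup \{\psi_{\vee}\}$ (see Fig. 6).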


It was proven in [10] that these operators preserve the properties SI, SM, FW, SW, 
CV, and the lack of them. Here we prove that the same holds for FM and SF as well. 


Theorem 8. Let $\mathcal{T}$ be a $\Sigma_1$-theory. Then: $\mathcal{T}$ is FM, or SF, w.r.t. $\{\sigma\}$ if and only if
$(\mathcal{T})^2$ is, respectively, FM, or SF, w.r.t. $\{\sigma, \sigma_2\}$.


Theorem 9. If $\mathcal{T}$ is a theory over an empty signature $\Sigma_n$ with sorts
$S = \{\sigma_1, \ldots, \sigma_n\}$, then: $\mathcal{T}$ is FM, or SF, w.r.t. $S$ if and only if $(\mathcal{T})_s$ is,
respectively, FM, or SF, w.r.t. $S$.


Theorem 10. If $\mathcal{T}$ is a theory over an empty signature $\Sigma_n$ with sorts
$S = \{\sigma_1, \ldots, \sigma_n\}$, then: $\mathcal{T}$ is FM, or SF, w.r.t. $S$ if and only if $(\mathcal{T})_{\vee}$ is,
respectively, FM, or SF, w.r.t. $S$.
Thus, in various cases, theories need not be invented from scratch, but can be
generated from other theories. For example, the theory $\mathcal{T}_{\geq n}$ exhibits all studied
properties, but is defined in a one-sorted signature. Using the operators, we obtain
variants of this theory in all signature types, namely $(\mathcal{T}_{\geq n})^2$ for empty many-sorted
signatures, $(\mathcal{T}_{\geq n})_s$ for non-empty one-sorted signatures, and $((\mathcal{T}_{\geq n})^2)_s$ for non-empty
many-sorted signatures. The properties of the theories generated using these operators
are guaranteed by Theorems 8 and 9, as well as the corresponding results from [10].

In two cases of theories defined using the Busy Beaver function, J” and ZĘ, we
cannot obtain them by relying on Theorem 10 from, respectively, 7° and 75, since the
signatures of the latter theories are not empty. Curiously, adding $\psi_{\vee}$ to their
axiomatizations still has the desirable outcome, but we prove this separately, without
relying on Theorem 10. Extending Theorem 10 to non-empty signatures is left for future
work.

The number of combinations of properties that we consider, together with the possible
types of the signatures, adds up to $2^9 = 512$. Our negative results from Sect. 3
guarantee that only ~15% of the actual table can be filled with examples. The remain- 
ing ~85% are either shaded or are excluded from the table for space considerations. As 
for the examples that can be given, notice that there are in total an astonishing number 
of 78 theories in our table. But, thanks to the theory operators of Definition 1, only 33 
of them (~42%) had to be concretely axiomatized in Figs. 5, 7 and 8. The remaining 45 
theories were defined using the operators. 


5 Conclusion 


We examined, in addition to all properties considered in [10], the finite model prop- 
erty, and stable finiteness. Interesting restrictions for the combinations involving these 
properties were established. We also found interesting theories to fill in our table of 
combinations, most prominently those involving the Busy Beaver function as well as 
its inverse. 

One possible direction this research could take is reasonably clear: considering the 
computability of the mincard function, which will, most probably, double the number
of theories to be taken into consideration. Further interesting properties that could be 
considered include the decidability of the theory’s axiomatization, or even its finiteness, 
and the satisfiability problem of the theory with respect to quantifier-free formulas. 

Second, some of the negative results in [10] and in the present paper only hold with 
respect to the entire set of sorts S. We plan to study if they hold also with respect to
proper subsets of sorts, and if they do not, to provide counterexamples to those gener- 
alizations. 
Abstract. When a proof system checks a formal proof, we can say that
its kernel asserts that the formula is a theorem in a particular logic. 
We describe a general framework in which such assertions can be made 
globally available so that any other proof assistant willing to trust the 
assertion’s creator can use that assertion without rechecking any associ- 
ated formal proof. This framework, called DAMF, is heterogeneous and 
allows each participant to decide which tools and operators they are will- 
ing to trust in order to accept external assertions. This framework can 
also be integrated into existing proof systems by making minor changes 
to the input and output subsystems of the prover. DAMF achieves a high 
level of distributivity using such off-the-shelf technologies as IPFS, IPLD, 
and public key cryptography. We illustrate the framework by describing 
an implemented tool for validating and publishing assertion objects and 
a modified version of the Abella theorem prover that can use and publish 
such assertions. 


1 Introduction 


In order to communicate a result from one formal reasoning system to another, a 
common technique is to transfer a formal proof certificate from the source system 
to the target system. This technique is usually required when the target system 
is autarkic,¹ wherein the system only trusts its own components, of which a
particularly trusted component is an implementation of a proof checking kernel. To
transfer a formal proof to an autarkic target system, either (a) the proof has to 
be translated from the source system, or (b) the verifier for the proof must be 
re-implemented as a certified procedure in the target system [6,25]. Both kinds 
of transferal are complicated for a variety of reasons: (1) The source and tar- 
get system may not be syntactically, semantically, or foundationally compatible. 
(2) The source-proof language can have complex operational semantics that is 
cumbersome to encode in the target system. (Note that no universal standard 
has yet emerged for encoding the formal semantics of arbitrary proof languages; 
cf. Sect. 5.) (3) As systems change and mature, older versions of proof certificates
can become stale and unmaintained. (4) Perhaps most importantly, many popular
reasoning systems do not produce proof certificates at all. Prominent examples of the
latter are SMT solvers that are not certifying when memory size and execution time are
critical [32] and the specification tool Twelf [42] when using non-certifying procedures
(e.g., totality checking).

¹ In [12], the adjective autarkic was applied to computational components of a proof
checker but not to an entire proof checker.

© The Author(s) 2023

Formal reasoning systems that are non-autarkic have an additional way to 
interact with external provers that addresses many of the above issues. In such 
systems, a host system is designed to build proof obligations that are then dis- 
patched to external systems to solve. While these external systems may produce 
proofs, the host system usually does not check the proofs and instead trusts 
the executions of the external systems. This system architecture is most com- 
monly used in program verification tools such as Dafny [28], Why3 [24], and 
TLAPS [16]. One issue not addressed with this enlarged view of trust is that the 
external dependencies tend to have unclear descriptions, especially from a third- 
party perspective. To illustrate, Dafny may declare that it trusts “Z3 v.4.12.1”, 
but what does this mean? Is this external dependency to be interpreted by name, 
in which case any tool called “Z3 v.4.12.1” can be used, or is it precisely identi- 
fied by, e.g., (a cryptographic hash of) the source code (or better, an executable 
binary) of a particular tool called “Z3 v.4.12.1”? Even with a precise identifica- 
tion, an external executable dependency may not be practical to incorporate. For 
example, the HOL Light system [27] re-checks its entire standard library every 
time it is started, taking on the order of minutes. If a development involves many 
calls to an external HOL Light-based solver, how are the calls to be orchestrated? 

In addition to these two bases of trust—autarkic based on proof certificates, 
and non-autarkic based on executions of external tools—there is at least one 
other basis of trust in any heterogeneous development: the agents that write 
and assemble the developments and execute the formal tools as required (check- 
ers, solvers, etc.). An example of an agent is a user, although one individual 
user can have many agent profiles (see Sect. 3.2). Entities such as a trustwor- 
thy central database can also correspond to an agent. Trusted agents have been 
largely neglected in the formal reasoning world, but they are common in other 
high reliability settings, such as security. Nevertheless, agents are at least implic- 
itly present in any formal development: to claim that a result has been formally 
achieved is tantamount to saying that some trustworthy agent (e.g., peer review- 
ers) has correctly and successfully executed a specific collection of formal tools 
to convince themselves of that formal result. Furthermore, if one agent A trusts 
another B, there is no need for A to re-check B’s proof scripts and re-execute 
any tools that B used to construct the result. 

In this paper, we propose a framework where a distributed collection of agents 
can exchange formal results (called assertions), where the results have an unim- 
peachable provenance, and where each agent is in full control of their trust 
parameters. This Distributed Assertion Management Framework (DAMF) is: 


— Decentralized: a global notion of truth is not imposed on every participant 
by means of a privileged logic, language, system, or software. This linguistic 
independence makes DAMF different from formalisms such as the evidential 
tool bus [20,38] that have been proposed for integrating external reasoning 
agents into a unified formal system. Participants in DAMF are free to combine
assertions from different sources if they believe the combination to be
meaningful. Any participant can retrieve and use any assertion they under- 
stand, and this external import will be explicitly marked as a dependency if 
they choose to publish assertions they build with such external imports. 

— Reliable: assertions have an irrefutable provenance, i.e., the fact that an agent 
has published an assertion is locally verifiable and independent of any other 
aspect of DAMF. Assertions, therefore, need to be immutably and eternally 
available, even in the presence of intermittent infrastructure and nefarious 
users or tools. 

— Composable: assertions are not rigidly constrained by their history; new log- 
ical artifacts such as theories, libraries, proof outlines, etc. can be crafted by 
reorganizing existing assertions based on their declared dependencies. 

— Egalitarian: the barrier to entry is low for participants who want to produce 
or consume such assertions. 

— Status Quo Compatible: existing work already done with current mainstream 
systems is readily incorporated as assertions without needing to modify any 
existing system. 


Concretely, DAMF provides JSON-based representations of a small num- 
ber of concepts such as formulas, assertions, dependencies, etc. without any up- 
front commitment to a formal syntax or any particular semantics. These objects 
are then added to a global store in terms of the InterPlanetary File System 
(IPFS) [13] using linked data in the InterPlanetary Linked Data (IPLD) format. 
An object in IPFS/IPLD is denoted by a canonical content identifier (cid), a 
cryptographic hash of its content. Knowing the cid is sufficient to retrieve the 
object by any participant of the IPFS network. Furthermore, the cids are the 
only externally visible names in DAMF, and links between objects are made 
using these cids by IPLD. Features specific to a particular language or system, 
such as constants, variables, definitions, and notations, are kept localized to par- 
ticular formula objects. Assertions are built using (the cids of) formula objects 
and signed by their creator agents using public key cryptography. IPFS is used to 
distribute DAMF objects transparently using various technologies whose precise 
details are irrelevant to this paper. 
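The idea of naming objects by a hash of their content, and linking them only through such names, can be sketched in a few lines. The code below is an illustration only, not DAMF's format: real IPFS cids are self-describing identifiers rather than bare SHA-256 hex digests, and the field names (`format`, `language`, `context`, `content`) as well as the in-memory `STORE` are assumptions made for this example.

```python
import hashlib
import json

STORE = {}  # stand-in for the global store: identifier -> canonical bytes

def put(obj):
    """Store a JSON-serializable object under the hash of its canonical encoding."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    ident = hashlib.sha256(data).hexdigest()
    STORE[ident] = data
    return ident

def get(ident):
    """Retrieve an object, checking that its content still matches its identifier."""
    data = STORE[ident]
    if hashlib.sha256(data).hexdigest() != ident:
        raise ValueError("content does not match identifier")
    return json.loads(data)

# Hypothetical objects; links between them are made only through identifiers.
language = put({"format": "language", "name": "Coq 8.16.1"})
context = put({"format": "context", "language": language,
               "content": "Definition lincomb (n j k : nat) := exists x y, n = x * j + y * k."})
formula = put({"format": "formula", "language": language, "context": context,
               "content": "forall n:nat, 8 <= n -> lincomb n 3 5"})

print(get(formula)["context"] == context)  # True
```

Because the identifier is derived from the content, anyone holding an identifier can verify what they retrieved without trusting the party that served it, and two agents who publish the same context independently end up with the same identifier.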

This paper is accompanied by two concrete implementations that illustrate 
DAMF. First, we provide a tool called Dispatch that can be used by users and 
systems to both produce and consume DAMF assertions. Dispatch is not a privi- 
leged tool in DAMF: users and systems can interact directly with DAMF objects 
in IPFS if they so choose. Dispatch is simply one interface to the DAMF global 
store, making the integration of producers and consumers minimally demanding. 
It does tasks such as schematically validating the concrete JSON objects added 
to or retrieved from the global store. Dispatch also helps to analyze and modify 
the trust parameters for (compositions of) assertions. 

Second, we implement a version of the Abella interactive theorem prover [10] 
that can produce and consume assertions in DAMF, mediated by Dispatch. As 
an example of its use, we show how Abella can use a lemma that was stated 
and proved using the automated linear arithmetic reasoning tactics of Coq (v.
8.16.1); this lemma is manually translated from the Coq to the Abella language, 
with an explicit dependency on its Coq development, and added to the global 
store by the present authors. A user can accept this heterogeneous development 
as long as they trust Coq, Abella, and our translation of the Coq lemma to Abella. 
Moreover, this assertion, which contains explicit links to the externally sourced 
DAMF imports, can be published back to DAMF for use by others. 

Since dependencies are explicitly tracked in DAMF assertions, any user can 
analyze various aspects of how it was composed of other assertions. Such analysis 
can form the basis of various kinds of investigations: for example, if a formula is 
found to be a non-theorem, an investigator can explore the compositions of the 
DAMF assertions that yield that formula in order to find the agents whose trust
parameters may need to be modified. The Dispatch tool mentioned above comes 
with a command called lookup that explores combinations of known assertions 
that ultimately yield a desired result; for each such composition, the analysis 
extracts the collection of agents (and tools) that could be trusted in order to 
accept that composition. 

In the next section, we describe the abstract design of DAMF and its underly- 
ing logic of assertions which form the basis of the abovementioned investigations. 
Section 3 describes our concrete implementation of DAMF, Sect. 4 discusses some 
of the design choices in DAMF, and Sect. 5 discusses some related work. The specific
software tools (Dispatch and Abella-DAMF) accompanying this paper are
fully documented at https://distributed-assertions.github.io/. 


2 Design of DAMF 


2.1 Languages, Contexts, and Formulas 


To transfer a theorem from a source proof system to a target proof system, we 
must be able to transfer the statement of the theorem, which we represent as a 
formula object in DAMF. To be as general as possible, we represent the content 
of such a formula as a string, i.e., in a format suitable as an input to a parser of 
the source proof system. In order to determine that the input is well-formed, the 
source proof system may need further information about the features (symbols,
predicates, functions, types, notations, hints, etc.) used in the formula. Such
additional information is the context of the formula, which we represent as a 
document fragment in the language of the source proof system. 
For example, take the following theorem written in Coq 8.16.1: 


1 Definition lincomb (n j k : nat) := exists x y, n = x * j + y * k.
2 Theorem ex_coq : forall n:nat, 8 <= n -> lincomb n 3 5.


The formula corresponding to the theorem ex_coq is the literal string “forall
n:nat, ... lincomb n 3 5”. The symbols 8, <=, etc. are part of the standard
prelude of this language, and the symbol lincomb is defined in line 1, so a sufficient 
context necessary for Coq 8.16.1 to parse and type-check the theorem statement 
is the text of line 1, which is also written in the Coq 8.16.1 language. 
Abstractly, a formula object in DAMF is a triple (L, X, F) where L denotes
a language, X denotes a context, and F denotes a formula, all of which may 
conceptually be thought of as strings. We will use the schematic variable N 
to range over such formula objects. The language L is a canonical identifier 
(specifically, the cid of a DAMF language object) which may optionally represent 
information about a suitable loader for the language that will make sense of the 
strings X and F; DAMF compares languages just by their identifiers. Moreover, 
L is interpreted as defining all the globally available features; for instance, the 
symbol nat is part of the standard prelude of this version of Coq and should 
therefore be understood as being defined in the language Coq 8.16.1. The context 
X introduces any user-defined features such as the definition lincomb above that 
is not part of Coq’s standard prelude. 

Note that DAMF formula objects are considered to be closed, i.e., every 
symbol used in the formula is defined in the language or the context. From 
the perspective of DAMF, a formula object is an atomic entity. Additionally, 
DAMF does not need to be aware of any reasoning principles of the language 
or context components. For instance, no mechanism in DAMF would allow the 
substitution of a declared symbol in the context with a concrete definition. The 
purpose of differentiating a formula object into three parts is purely pragmatic: 
the language part will in most cases be a well known object used by many agents, 
and the context part may potentially be shared between multiple assertions. 
DAMF consumers may be able to use this sharing of information to consolidate 
tasks such as context-processing. 
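A minimal sketch of the triple (L, X, F), and of how a consumer might exploit shared contexts, is given below. It is not the DAMF data model: the class `FormulaObject`, the placeholder language identifier `COQ`, the helper `group_by_context` and the second formula are all hypothetical and serve only to illustrate the description above.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class FormulaObject:
    language: str  # identifier of a language object; compared only by identity
    context: str   # user-defined features, as text in that language
    formula: str   # the statement itself, as text in that language

COQ = "coq-8.16.1"  # placeholder; in DAMF this would be the cid of a language object
CTX = "Definition lincomb (n j k : nat) := exists x y, n = x * j + y * k."

ex_coq = FormulaObject(COQ, CTX, "forall n:nat, 8 <= n -> lincomb n 3 5")
# A second, made-up statement that happens to share the same context.
other = FormulaObject(COQ, CTX, "forall n:nat, 10 <= n -> lincomb n 3 5")

def group_by_context(objects):
    """Group formulas by (language, context) so each context is processed once."""
    groups = defaultdict(list)
    for obj in objects:
        groups[(obj.language, obj.context)].append(obj.formula)
    return dict(groups)

groups = group_by_context([ex_coq, other])
print(len(groups), len(groups[(COQ, CTX)]))  # 1 2
```

Keeping the three parts separate costs nothing in expressiveness, since each object is still closed, but it lets a consumer load a language and parse a context once for every formula that refers to them.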


2.2 Sequents and Assertions 


A sequent in DAMF is abstractly of the form $N_1, \ldots, N_k \vdash N_0$ where each of
the $N_i$ is a DAMF formula object as defined in the previous subsection. We will
use the schematic variable $\Gamma$ to range over ordered lists of formula objects, and
$S$ to range over sequents. In a sequent $\Gamma \vdash N$, we say that $N$ is the conclusion
and $\Gamma$ are the dependencies. Such sequent objects may be produced whenever
a formal proof has been checked in a proof checker: the conclusion represents
the statement of the theorem, and the dependencies are external lemmas that
were used during that proof. As an example, suppose the Coq 8.16.1 theorem in
Sect. 2.1 has a proof that appeals to the lemma lem : forall m n, m <= n -> S m
<= n \/ m = n. The sequent that is produced is conceptually of the form lem $\vdash$
ex_coq, though concretely we would have to build DAMF formula objects by
packaging the language and contexts.

An agent is a globally unique name. We use the schematic variable K to 
range over agents. We define a simple multi-sorted first-order logic where agents 
and sequents are primitive sorts and where the infix predicate says is the sole 
predicate; the atomic formula K says S, where K is an agent and S a sequent, 
is an assertion. The says predicate is implemented in DAMF using public-key 
cryptography. In a DAMF-aware proof system, when an appeal is made (say, as
part of the proof of some other theorem) to an assertion
$K$ says $(N_1, \ldots, N_k \vdash N_0)$, the appeal is interpreted as follows:

— The agent $K$ is treated as trusted; if the agent cannot be trusted for some
reason, such as if $K$ occurs in a deny list, then the assertion is unusable.

— The conclusion of the assertion, $N_0$, contains the formula representing the
lemma that is being appealed to. Note, in particular, that the dependencies
$N_1, \ldots, N_k$ are not relevant to appealing to this assertion as an external
dependency. These dependencies will be used in reasoning about compositions
in DAMF, as described in Sect. 2.4.
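The two points above can be sketched as a small check performed by a consumer. This is an illustration under stated assumptions rather than DAMF's implementation: signature verification is deliberately left out, and the names `Sequent`, `Assertion` and `usable_lemma`, as well as the agent names, are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Sequent:
    dependencies: Tuple[str, ...]  # the formula objects on the left of the turnstile
    conclusion: str                # the formula object on the right

@dataclass(frozen=True)
class Assertion:
    agent: str        # the agent K claiming the sequent (signature check omitted here)
    sequent: Sequent

def usable_lemma(assertion, trusted, deny_list=frozenset()):
    """Return the conclusion if the asserting agent is trusted, otherwise None."""
    if assertion.agent in deny_list or assertion.agent not in trusted:
        return None
    # The dependencies are ignored on purpose: they matter only for composition.
    return assertion.sequent.conclusion

a = Assertion("agent-1", Sequent(("lem",), "ex_coq"))
print(usable_lemma(a, trusted={"agent-1"}))                          # ex_coq
print(usable_lemma(a, trusted={"agent-1"}, deny_list={"agent-1"}))   # None
```

Trust is a decision local to the consumer: the same assertion is usable for one user and unusable for another, without any change to the published object.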


2.3 Adapters 


Because every formula object packages the formula together with its context and 
language identifier, every formula object is independent of every other formula 
object. Thus, in a sequent $N_1 \vdash N_0$, there is no requirement that the conclusion
$N_0$ and the dependency $N_1$ be in the same language or have a common context.
When working within a single autarkic system (e.g., a proof checker using a single 
logic), the sequents that are generated for every theorem will probably place the 
conclusion and dependencies in the same language and context; however, in the 
wider non-autarkic world, we can use multilingual sequents as first class entities 
that are documented and tracked the same way as any other kind of sequent. 

An important class of multilingual sequents comes from adapters. In order 
for a theorem written in the Coq 8.16.1 language to be used by a different 
system with a different language, say Abella 2.0.9, we will need to transform 
the formula objects in the former language to those in the latter language. This 
kind of translation is an example of a language adapter, which falls into the 
general class of adapters, and which creates a sequent by translating between 
languages or modifying the logical context by standard logical operations such 
as weakening (adding extra symbols), instantiation (replacing a symbol by a 
term), or unfolding (replacing a defined symbol by its definition). 

As an example, the Coq 8.16.1 example above can be translated to the 
Abella 2.0.9 language as follows, where the function symbols + and * are 
replaced by relations in Abella.²


1 Import "nats". % some natural numbers library
2 Define lincomb : nat -> nat -> nat -> prop by
3   lincomb N J K := exists X Y U V,
4     times X J U /\ times Y K V /\ plus U V N.
5 Theorem ex_ab : forall n, nat n -> le 8 n -> lincomb n 3 5.


Lines 1-4 determine the context $\Sigma_{\mathit{ex\_ab}}$ for the formula ex_ab on line 5.
The sequent that represents this translation therefore has the form


$$(\text{Coq 8.16.1}, \Sigma_{\mathit{ex\_coq}}, \mathit{ex\_coq}) \vdash (\text{Abella 2.0.9}, \Sigma_{\mathit{ex\_ab}}, \mathit{ex\_ab}).$$


Suppose agent $K_1$ signs this translation and that agent $K_2$ signs the sequent
$\vdash (\text{Coq 8.16.1}, \Sigma_{\mathit{ex\_coq}}, \mathit{ex\_coq})$. As long as $K_1$ and $K_2$ are trusted by the user
of Abella 2.0.9, the formula object $(\text{Abella 2.0.9}, \Sigma_{\mathit{ex\_ab}}, \mathit{ex\_ab})$ can also
be treated as a theorem by that user thanks to composition, discussed next.


2 This encoding of functions using relations is the usual one: see [17] for details.


2.4 Composing Assertions, Trust


Assertions will be composed by means of a single rule of inference that imple- 
ments a cut-like rule for sequents, COMPOSE. 


$$\frac{K\ \mathsf{says}\ (\Gamma_1 \vdash M) \qquad K\ \mathsf{says}\ (M, \Gamma_2 \vdash N)}{K\ \mathsf{says}\ (\Gamma_1, \Gamma_2 \vdash N)}\ \text{COMPOSE}$$


One effect of this rule is that the says predicate does not correspond one-to-one
with cryptographic signatures. The conclusion of the COMPOSE rule may,
in particular, not be a sequent that has been explicitly signed by the agent
$K$ even if both premises are. Rather, the rule states that whenever $K$ can be
said to reliably claim, either by a cryptographic signature or by a COMPOSE-derivation
tree, that both $\Gamma_1 \vdash M$ and $M, \Gamma_2 \vdash N$, then $K$ must also reliably
claim $\Gamma_1, \Gamma_2 \vdash N$.

There are many variations to access control logic in the literature. For exam- 
ple, some such logics use inference rules such as: 


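$$\frac{\Gamma \vdash N}{K\ \mathsf{says}\ (\Gamma \vdash N)} \qquad \text{or} \qquad \frac{K\ \mathsf{says}\ (\Gamma \vdash N)}{K\ \mathsf{says}\ (K\ \mathsf{says}\ (\Gamma \vdash N))}$$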


Such rules are neither syntactically well-formed nor desirable for our purposes. 
We use here a very weak access control logic (see [1] for a survey of such logics).
As a result, checking the validity of a given derivation using COMPOSE is computationally
trivial: each instance of it must eliminate exactly the leftmost dependency
in the second premise, which is a DAMF formula object that is compared
by cid.
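
The validity check just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a sequent is modeled as a tuple of dependency cids together with a conclusion cid, the agent K is left implicit, and the names Sequent and compose are ours rather than part of DAMF or Dispatch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Sequent:
    deps: Tuple[str, ...]  # cids of the dependency formula objects, in order
    concl: str             # cid of the conclusion formula object


def compose(first: Sequent, second: Sequent) -> Sequent:
    """From (G1 |- M) and (M, G2 |- N) build (G1, G2 |- N).

    The only check is a string comparison of cids: the leftmost
    dependency of the second premise must be the conclusion of the first.
    """
    if not second.deps or second.deps[0] != first.concl:
        raise ValueError("leftmost dependency does not match the first conclusion")
    return Sequent(first.deps + second.deps[1:], second.concl)


# Hypothetical cids, chosen only for the illustration.
lemma = Sequent(("cid-A",), "cid-M")
theorem = Sequent(("cid-M", "cid-B"), "cid-N")
composed = compose(lemma, theorem)
```

Here composed carries the dependencies cid-A and cid-B and the conclusion cid-N, while swapping the two premises is rejected because cid-A is not the conclusion of the theorem sequent.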

Observe that the agent K does not participate in a meaningful way in a 
derivation that is built with the COMPOSE rule. Thus, for a given end sequent 
of the form $K\ \mathsf{says}\ (\vdash N)$, a COMPOSE derivation can be seen as a proof outline
for the desired theorem N, with the leaves of the derivation being the assertions 
that need to be sourced from an assertion database (such as the DAMF global 
store). We say that an assertion (K says S) is published if it can be retrieved 
from such a database. The inference system is then enlarged with the following 
rule that can be used to complete the open leaves of the COMPOSE derivation 
using assertions made by different agents. 


$$\frac{(K_1\ \mathsf{says}\ S) \text{ is published}}{K_2\ \mathsf{says}\ S}\ \text{TRUST}\ [K_1 \mapsto K_2]$$


This rule is parameterized by a pair of agents, $K_1$ and $K_2$, and is understood
to be applicable only when $K_1$ is in the user-specified allow list of $K_2$ (i.e., $K_1$
speaks for $K_2$, which we write as $[K_1 \mapsto K_2]$).

We do not assume that agents have any additional closure properties beyond 
COMPOSE and TRUST. For example, suppose $N_A$, $N_{A \to B}$, and $N_B$ are the formula
objects that correspond to the formulas $A$, $A \to B$, and $B$ respectively in
some language. We do not assume that the following rule is admissible:


$$\frac{K\ \mathsf{says}\ (\Gamma \vdash N_{A \to B}) \qquad K\ \mathsf{says}\ (\Gamma \vdash N_A)}{K\ \mathsf{says}\ (\Gamma \vdash N_B)}\ \text{MP}$$


That is, we do not assume that the formulas asserted by agent K are closed under 
modus ponens. Similarly, we do not assume that what agents assert are closed 
by substitution or instantiation of any symbols that are defined in the contexts 
of the formula objects. While a particular agent may not be closed under modus 
ponens, substitution, or instantiation, it is possible to employ other agents that 
can look for opportunities to apply such inference rules on the results of trusted 
agents. In particular, if we want the query engine to be able to use the MP rule, 
then the engine must construct an agent $K_{MP}$ whose sole function is to generate
assertions such as $K_{MP}\ \mathsf{says}\ (N_{A \to B}, N_A \vdash N_B)$ that correspond to applications
of the MP rule. Of course, $K_{MP}$ will need to be in the allow list for any agent
wanting to use this agent. 


2.5 Producing Assertions, Formal Reasoning Tools 


Conceptually, an agent constructs a DAMF sequent as a consequence of running 
formal reasoning tools such as proof checkers or theorem provers. DAMF includes 
tool objects, which are unconstrained JSON objects that can be used to describe 
such tools. A tool object does not necessarily describe an implemented tool; it 
might describe a part of it, or an abstract description of the logical system in
which the sequent is asserted, for instance. As with languages in Sect. 2.1,
we compare tools for equality by means of the cids of these tool objects. It is 
also possible for an agent to build a DAMF sequent manually, without running 
any tool. The agent may do this for a number of reasons: e.g., the assertion may 
be a conjecture (i.e., a proof may be provided at some other time but is currently 
missing) or a manually produced adapter. 

A DAMF production is a sequent that is annotated with a mode that 
describes how the sequent was produced; this mode can be the cid of a tool 
object mentioned above, or it can be null expressing an unproven sequent. We 
use the schematic variable $T$ for modes, and write a production of the sequent
$\Gamma \vdash N$ with mode $T$ as $\Gamma \vdash_T N$. Published DAMF assertions will be of the form
$K\ \mathsf{says}\ (\Gamma \vdash_T N)$, and we modify the TRUST rule to the following:


$$\frac{(K_1\ \mathsf{says}\ (\Gamma \vdash_T N)) \text{ is published}}{K_2\ \mathsf{says}\ (\Gamma \vdash N)}\ \text{TRUST}\ [K_1/T \mapsto K_2]$$


where the side condition $[K_1/T \mapsto K_2]$ means that $K_2$ allows $K_1$’s assertions
in mode $T$. It may be tempting to think of $K_1/T$ as an agent by itself, but, as
we shall see in Sect. 3.1, agents are implemented in DAMF using keypairs, so if
$K_1/T_1$ and $K_1/T_2$ were separate agents then there would be no verifiable way to
link them both to $K_1$. This use of modes makes it possible, for example, to trust
an agent K using any version of Coq while not trusting K when using other 
proof systems. 
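
As a minimal sketch of this side condition, an allow list can be modeled as a set of (agent, mode) pairs. The representation, the name trust_applies, and the cid strings below are assumptions made for illustration; the premise that the assertion is published is not modeled.

```python
from typing import FrozenSet, Optional, Tuple

Mode = Optional[str]                     # cid of a tool object, or None for the null mode
AllowList = FrozenSet[Tuple[str, Mode]]  # (trusted agent, permitted mode) pairs


def trust_applies(allow_list: AllowList, agent: str, mode: Mode) -> bool:
    """Check the side condition of TRUST for the owner of allow_list:
    assertions of `agent` are accepted only in an explicitly allowed mode."""
    return (agent, mode) in allow_list


# Hypothetical tool cids: two versions of Coq and one unrelated prover.
COQ_A, COQ_B, OTHER = "cid-coq-version-a", "cid-coq-version-b", "cid-other-prover"

# K2 trusts K1 whenever K1 used one of the listed Coq versions, and nothing else.
allow_k2: AllowList = frozenset({("K1", COQ_A), ("K1", COQ_B)})

accepted = trust_applies(allow_k2, "K1", COQ_A)
rejected_tool = trust_applies(allow_k2, "K1", OTHER)
rejected_conjecture = trust_applies(allow_k2, "K1", None)
```

Keying the allow list on the pair rather than on the agent alone reflects the point made above: the mode refines trust in a single keypair instead of introducing a separate agent.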

2.6 Logical Consistency of Heterogeneous Combinations


DAMF imposes no constraints on the composition of assertions, which can at 
first glance appear to be risky. For example, suppose the assertions come from
incompatible logics, say an assertion in classical logic that is used during the proof
of an intuitionistic theorem. Without exceptional care, the result of a COMPOSE will only
be classically, not intuitionistically, true. Similar problems exist if the imported
assertion requires additional axioms that are incompatible with the user’s setting 
(e.g. extensionality or UIP in the setting of univalence). 

This issue highlights the fact that DAMF does not guarantee logical compat- 
ibility of assertions; rather, DAMF is more accurately seen as a record of com- 
positions that have been made. To trust an agent’s assertion is just to say that 
we trust that the agent indeed had good reasons (such as a proof) to make that 
assertion, not that the assertion may be arbitrarily composed. Moreover, DAMF 
assertions are intended to be read as hypothetical statements from dependencies 
to conclusions (where “hypothetical” is understood in the informal language of 
discourse rather than as a formal implication or entailment). If the dependencies 
cannot be met, the assertion is useless. To illustrate, if an agent $K$ wants to
use an assertion $\Gamma \vdash M$ in their proof of $N$, the assertion they will publish is
$K\ \mathsf{says}\ (M \vdash N)$, which is acceptable in isolation; if $M$ is incompatible with the
logic of $N$, then the assertion $K\ \mathsf{says}\ (M \vdash N)$ is vacuous.


3 Implementation: Information, Processes, and Tools 


3.1 The Structures of the Global Store 


A crucial design criterion of DAMF is that the assertions and their constituent 
objects are a globally shared commodity, existing independently of the tools 
that produce or consume them. To this end, DAMF requires well-defined basic 
structures that producers would produce and consumers would expect and know 
how to address. 

The use of a content-addressing scheme is an essential part of seeing these 
structures as global. Each structure is identified and addressed by a unique 
global identifier in a common namespace in an independently verifiable and 
trusted way: the identifier is derived from the content itself and every alteration 
of the content produces a new identifier; at the DAMF level, the content is 
the name/address, and comparing two objects structurally at the DAMF level is 
reduced to comparing their cids as strings. One way to handle differences in cids 
between different forms of conceptually the same DAMF object is by curation 
and normalization of such structures at the level of producers or potentially 
other DAMF actors. 
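
The role of normalization can be illustrated with a toy identifier. The sketch below is not how IPFS computes cids: it assumes, purely for illustration, that the identifier is a SHA-256 hash of a JSON serialization with sorted keys, and the name toy_cid and the sample objects are ours.

```python
import hashlib
import json


def toy_cid(obj) -> str:
    """Stand-in for a cid: hash of a normalized JSON serialization."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content written with a different key order, and a genuinely different content.
f1 = {"format": "formula", "content": "le 8 n", "language": "lang-cid"}
f2 = {"language": "lang-cid", "content": "le 8 n", "format": "formula"}
f3 = {"format": "formula", "content": "le 9 n", "language": "lang-cid"}

same_after_normalization = toy_cid(f1) == toy_cid(f2)
changed_content_changes_id = toy_cid(f1) != toy_cid(f3)
```

Structural comparison then reduces to comparing the resulting strings, which is the property the text relies on.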

The structures we may want to specify in DAMF are built by composing sev- 
eral elements; for instance, a sequent contains formula structures, which them- 
selves contain context structures. In DAMF, we make the design choice to treat 
all such structures as first class objects stored in a distributed network through 
IPFS, and use the linked data representation of IPLD to represent an object as
being composed of other objects. 

The core DAMF structures we define are context, formula, sequent, produc- 
tion, and assertion. Concretely, these structures are represented as JSON objects 
with a varying format property which has the type of the structure as its value. 
These structures are described as follows (full definitions in [4, Appendix A]):


— Context: contains a language field, which is an IPLD link to a language object, 
described in Sect. 2.1, and a content field containing the body of the context. 

— Formula: contains a language field, a content field for a string representation 
of the formula in the language, and a context field that is an IPLD link to a
context object, as described in Sect. 2.1. 

— Sequent: a dependencies field mapped to a list of IPLD links to formula 
objects, and a conclusion field as an IPLD link to a formula object. 

— Production: pairs a sequent object with a mode field denoting a mode of pro- 
duction of a sequent as described in Sect. 2.5. 

— Assertion: a claim field mapped to an IPLD link to a production (currently 
considered the main claim type in DAMF), an agent field mapped to a public 
key, and a signature field containing the result of signing the cid of the value 
of the claim field. 


Given these schemata, the aspects of tracking and trusting become natural: a 
formula present as a dependency in some assertion could be matched with the 
same formula present as the conclusion of a different assertion. 
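
This matching can be sketched over a toy store. The field names follow the schemata above; the cid strings, the store dictionary, and the helper names link and providers are hypothetical, and the {"/": cid} shape for links follows the IPLD DAG-JSON convention, which we assume here.

```python
def link(cid: str) -> dict:
    """An IPLD-style link, written in the DAG-JSON shape (an assumption here)."""
    return {"/": cid}


# Toy global store: made-up cids mapped to core objects.
store = {
    "cid-ctx": {"format": "context", "language": link("cid-lang"),
                "content": ["Kind nat type."]},
    "cid-lemma": {"format": "formula", "language": link("cid-lang"),
                  "content": "lemma statement", "context": link("cid-ctx")},
    "cid-thm": {"format": "formula", "language": link("cid-lang"),
                "content": "theorem statement", "context": link("cid-ctx")},
    "cid-seq1": {"format": "sequent", "dependencies": [],
                 "conclusion": link("cid-lemma")},
    "cid-seq2": {"format": "sequent", "dependencies": [link("cid-lemma")],
                 "conclusion": link("cid-thm")},
}


def providers(formula_cid: str, objects: dict) -> list:
    """cids of the sequents whose conclusion is the given formula cid."""
    return [cid for cid, obj in objects.items()
            if obj["format"] == "sequent" and obj["conclusion"] == link(formula_cid)]


# For each dependency of cid-seq2, find the sequents that conclude it.
matches = {dep["/"]: providers(dep["/"], store)
           for dep in store["cid-seq2"]["dependencies"]}
```

No formula content is inspected: the dependency of the second sequent is matched with the conclusion of the first solely because both links carry the same cid.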

It is also useful to annotate these core DAMF objects with additional meta- 
data such as external names, proof objects, timestamps, etc. In DAMF, we have 
chosen to give the core objects a cid independent of the metadata; instead, for 
every core object, we define an annotated object that is composed of a link to 
the core object and a link to any additional metadata. DAMF follows the design 
principle that objects are to be considered equal at the DAMF level if they have 
the same cid: the content of the objects is not examined, and no IPLD-links 
are followed for such comparisons. Generally speaking, therefore, DAMF core 
objects will not link to annotated objects, since the annotations will factor into 
the cids and force disequality when undesired, such as when building compositions
(Sect. 2.4). The only exceptions to this rule of thumb are assertion objects,
which can use annotated production objects as their claims. Note that every
assertion object will be globally unique when produced: it will have a different 
cid each time its claim is signed, even if signed by the same agent, because 
cryptographic signatures always include a nonce. 

Another layer of structures that can aggregate global object references are 
collections. We currently define one generic collection format in our implemen- 
tation: many other non-generic collection formats can easily be considered. 


3.2 Processes in DAMF, and Dispatch as an Intermediary Tool 


The two obvious processes in DAMF are the production and consumption of 
DAMF objects. In a production process, DAMF objects are constructed starting 
from local information, published, and then stored across the distributed network.
The consumption process is in the opposite direction: locally consumable
information is constructed from DAMF objects. The important point is that
these DAMF objects are common and well-understood (as DAMF formats) for 
all consumers, and each consumer decides what to consume and how to consume 
it. For example, a consumer might only choose to read formulas that are of some 
specific language, and then decide how to process their internal structures based 
on its own criteria. Beyond these two, further processes operate on the
published DAMF objects and involve their combination, curation, and
analysis. The process we consider first in our implementation is lookup, which
is discussed further below. Individual producers and consumers, such as theorem
provers, can choose to implement one or several of these DAMF processes.
However, many aspects of dealing with linked data and IPFS will be common to 
such tools, so we describe an intermediary tool called Dispatch that simplifies the 
interactions between these producers and consumers and the DAMF global store. 
Of course, Dispatch would be considered part of the trusted code base, along with 
IPFS and any utilities used to manipulate JSON data and cryptographic signatures.
If this is problematic, Dispatch can be forgone entirely in favor of
native implementations.

The Dispatch tool is distributed as an executable dispatch with three subcom- 
mands: publish, get, and lookup. The dispatch publish command operates on 
one of a collection of standard input formats that contains local information cor- 
responding to DAMF types. After syntactically validating this input, the publish 
command will construct and publish the global objects. Dispatch can also option- 
ally interact with a specific storage service in order to make that object widely 
discoverable in the IPFS network. As an example, consider the following input 
for an assertion object, where newly created formulas and contexts are placed in 
the same file and are referred to by local names such as plus_comm, and previously
existing objects are referred to by their cids using the damf: flag, such as the first
value of “dependencies” (line 10) which refers to a formula object cid, as well as 
“language” and “mode” values which refer to existing language and tool objects 
respectively. 


1  { "format": "assertion",
2    "agent": "localAgent",
3    "claim": {
4      "format": "annotated-production",
5      "annotation": ...,
6      "production": {
7        "mode": "damf:bafyreihnx2...",
8        "sequent": {
9          "conclusion": "plus_comm",
10         "dependencies": [ "damf:bafyreihw6g...", "plus_succ" ] } } },
11   "formulas": {
12     "plus_comm": {
13       "language": "damf:bafyreidyts...",
14       "content": "forall M N K, nat K -> ...",
15       "context": ["plus"] },
16     "plus_succ": {
17       "language": "damf:bafyreidyts...",
18       "content": "forall M N K, ...",
19       "context": ["plus"] } },
20   "contexts": {
21     "plus": {
22       "language": "damf:bafyreidyts...",
23       "content": [
24         "Kind nat type.", "Type z nat.", "Type s nat -> nat.",
25         "Define plus : nat -> nat -> nat -> prop by ..." ] } } }


This example is based on an output from our Abella-DAMF prover described 
below. A prover using the Dispatch tool only needs to be able to produce and consume
JSON objects with this structure, without needing to interface with IPFS
directly. The value of “agent” (line 2) refers to an agent profile in Dispatch; each 
profile maps a user-readable name to a cryptographic key-pair, created separately 
using the dispatch create-agent command. 

The dispatch get command takes a cid as an argument, fetches the IPLD dag 
(the full JSON object) referenced by it from the global store, validates the types 
of all constituent IPLD linked objects, verifies any signatures, and finally outputs 
a JSON object that is similar in structure to that accepted by dispatch publish. 
The consumer will have access to all the necessary DAMF objects referenced by
the root cid without needing to interact with the global store or to structurally
validate any objects. The only difference between the output of dispatch get
and the input of dispatch publish is that the local names that appeared in the 
input will be replaced by cids (i.e., global names) in the output. Input and 
output formats corresponding to other global types are described further at the 
site mentioned in the introduction.³

The dispatch lookup command, as mentioned earlier, is the starting process 
that we consider in our implementation regarding the combination and analysis 
of DAMF assertions. Given a formula cid and a collection of assertion cids, 
the output of this command is a list of potential sets of (agent, mode/tool) 
pairs that correspond to combinations of assertions that would yield the target 
formula. Any remaining unmatched dependency is also output along with
the (agent, mode/tool) pairs. In our current implementation, Dispatch exhaus- 
tively generates all possible ways of constructing the target formula. A direct 
improvement is to change this aspect of the tool to allow for a more interactive 
and incremental exploration of such dependencies. In addition, filtering through 
allow-lists would reduce the number of assertion combinations generated by this 
command. 
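
One plausible reading of this exhaustive search is sketched below. It is not the Dispatch implementation: the Assertion record, the function name lookup, and the treatment of cycles (a dependency already being resolved is reported as unmatched) are assumptions made for illustration.

```python
from itertools import product
from typing import FrozenSet, List, NamedTuple, Optional, Tuple


class Assertion(NamedTuple):
    agent: str
    mode: Optional[str]     # cid of a tool object, or None for the null mode
    deps: Tuple[str, ...]   # cids of dependency formulas
    concl: str              # cid of the conclusion formula


Pair = Tuple[str, Optional[str]]
Result = Tuple[FrozenSet[Pair], FrozenSet[str]]  # (agent/mode pairs, unmatched deps)


def lookup(target: str, assertions: List[Assertion],
           seen: FrozenSet[str] = frozenset()) -> List[Result]:
    """Enumerate every way of deriving `target` from the given assertions."""
    results = set()
    for a in assertions:
        if a.concl != target:
            continue
        per_dep = []
        for d in a.deps:
            # Do not chase a dependency that is already being resolved.
            cyclic = d == target or d in seen
            sub = [] if cyclic else lookup(d, assertions, seen | {target})
            # A dependency with no derivation is reported as unmatched.
            per_dep.append(sub or [(frozenset(), frozenset({d}))])
        for choice in product(*per_dep):
            pairs = frozenset({(a.agent, a.mode)}).union(*(p for p, _ in choice))
            unmatched = frozenset().union(*(u for _, u in choice))
            results.add((pairs, unmatched))
    return list(results)


# Hypothetical collection: K1 proves A outright; K2 proves C from A and B.
collection = [
    Assertion("K1", "cid-coq", (), "cid-A"),
    Assertion("K2", "cid-abella", ("cid-A", "cid-B"), "cid-C"),
]
found = lookup("cid-C", collection)
```

For this collection the search returns a single combination involving both agents, with cid-B left as an unmatched dependency. Filtering the assertions by an allow list before the search is one way to shrink the product over dependencies.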


3.3 Edge Systems Example: Abella 


We have implemented a DAMF-aware branch of Abella [10] as an example of a 
system that interacts with assertions in DAMF with the help of Dispatch as a 


3 https://distributed-assertions.github.io/.
mediator. Abella was originally designed to test a particular approach to meta-
theoretic reasoning using a new, proof-theoretically motivated mechanism for 
reasoning directly with bound variables (in particular, the ∇-quantifier [30] and
a treatment of equality based on equivariant higher-order unification [26]). While 
the current implementation of Abella has succeeded with those meta-theoretic 
tasks [22,41], the prover has not grown much beyond that domain. Indeed, Abella 
has some (mis)features that make it a good test case for DAMF: (1) it has no 
awareness of the file system and it is easy to replace the backing store from 
local files to objects stored in IPFS; (2) it has a feature-poor proof language
with nearly no support for proof automation and hence underdeveloped formal
mathematical libraries; and (3) it uses relational specifications as opposed
to the more common functional programming specifications. Furthermore, the
area of meta-theory that Abella treats declaratively is also an area that many
conventional proof systems do not handle well, in part because of the need to encode
and manipulate bindings [9,23]. Such conventional systems might be willing to
delegate such meta-theoretic reasoning to Abella. 

Ordinary Abella developments (in .thm files) support a kind of import mecha- 
nism which loads in marshaled results from a different run of Abella. We extend 
import with a new kind of statement: Import "damf:bafyr..." that refers to a
collection of DAMF assertions (i.e., a DAMF collection object whose elements 
are assertions). Dispatch is used to fetch all the referenced objects from IPFS as 
explained in the previous subsection. 

To appeal to an assertion, the elements of the context of the conclusion of 
the assertion are merged using their internal names with the ambient context of 
Abella where the assertion is appealed to. An Abella declaration in the context 
is mergeable if it has both the same internal name and an identical (up to
λ-equivalence) definition; thus, type and term constants are merged if they have the
same kinds or types (respectively), and (co-)definitions are merged if they have 
the same definitional clauses. This is done to keep the implementation simple 
and mostly unchanged from the standard (non-DAMF) Abella, which also only 
allows an Import declaration when the imported objects can be merged. 
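
The merge criterion can be sketched as follows, under the simplifying assumption that each declaration is a string already normalized so that equality of strings stands in for identity of definitions. The function merge_context and the sample declarations are hypothetical and do not reproduce Abella's code.

```python
from typing import Dict, Optional


def merge_context(ambient: Dict[str, str],
                  imported: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Merge declarations by internal name.

    Returns the merged context, or None when some name is declared in both
    contexts with different definitions, in which case the assertion cannot
    be appealed to.
    """
    merged = dict(ambient)
    for name, decl in imported.items():
        if name in merged and merged[name] != decl:
            return None
        merged[name] = decl
    return merged


ambient = {"nat": "Kind nat type.", "z": "Type z nat."}
compatible = {"nat": "Kind nat type.", "s": "Type s nat -> nat."}
clashing = {"z": "Type z int."}

merged_ok = merge_context(ambient, compatible)
merged_bad = merge_context(ambient, clashing)
```

The first merge succeeds and adds the declaration of s, while the second fails because z is declared with a different type.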

When the proof of a theorem is completed in Abella, a sequent object is 
constructed with the dependencies being all the DAMF lemmas appealed to in 
the proof, and the conclusion being the statement of the theorem (the formula) 
in the context of all its necessary declarations, computed using a dependency 
analysis. We use only the necessary declarations to allow such DAMF sequents 
to have the widest possible uses, since a DAMF assertion can only be used in 
Abella if the entire context of the conclusion can be merged. 

A full example of an Abella development that makes use of imported assertions
from Abella, Coq, and λProlog can be found in [4, Appendix B]. In this
example, Coq and λProlog are not modified at all, and Abella is only minimally
modified to use Dispatch to interact with DAMF assertions. The total amount of 
modifications to Abella to interface with Dispatch amounts to about 100 lines of 
code, most of which deals with (un)marshalling JSON. We expect that making
other tools DAMF-aware would require a similarly small effort.

4 Discussion: Design Choices and Alternatives


4.1 The Role of Formal Proofs 


Autarkic theorem provers often exploit the existence of proofs for several rea- 
sons. Obviously, the ability to check a fully detailed proof object in their own 
kernel, following the De Bruijn criterion [11], is central. But proofs can also be 
used for various other roles. For example, they sometimes contain constructive 
content that can be extracted as executable programs, and they can be used 
as guides during the development and maintenance of other proofs. Given their 
central role in many proof assistants, a great deal of effort has gone into the 
formalization, manipulation, and transformation of formal proof objects; see, for 
example, MMT [35], Logipedia [21], and foundational proof certificates [18]. As 
a concrete matter, proof objects can be included in the annotations of annotated 
productions in the global store of DAMF. Sequents are linked in productions by 
their cids, so it is possible for the same sequent to have multiple proof objects 
contributed by different agents in separate assertions. 


4.2 Potential Benefits to Mainstream Systems 


The fact that proof objects are not central to DAMF and the example presented 
in Sect. 3.3 might lead the reader to believe that the only beneficiaries of DAMF 
are new systems that want to leverage existing developments in mainstream 
systems. This belief is not necessarily true for two reasons. First, there are certain 
logical systems and formalization styles that are inordinately complicated or 
impossible to do in mainstream systems. Good examples are nominal sets [34], 
λ-tree syntax (a.k.a. higher-order abstract syntax) [2,23], generic judgments [30],
and nominal abstraction [26]. It is conceivable that a mainstream prover can 
use DAMF to import a formalization such as the proof of soundness of Howe’s 
method done in the setting of higher-order abstract syntax and contextual modal 
type theory [31], which is at present not available in a mainstream proof system 
such as Coq or Agda. 

A second benefit to mainstream systems is to enable more trustworthy refac- 
toring of their existing implementations. For example, modern autarkic provers 
routinely recheck large collections of proofs, often after every invocation of a new 
instance of the proof checker and certainly after every change in the version of 
the prover. As a result of needing to recheck such proofs, there is a tendency 
for implementers of proof checkers to optimize such kernels to be more efficient. 
However, such optimizations can add greater complexity to a kernel, making 
errors in the kernel more likely to occur. With DAMF, once a trustworthy but 
slow kernel—e.g., a certified implementation of a kernel [39]—checks a proof, 
it rarely needs to be rechecked. This can even lower the pressure for kernel 
implementations to chase performance with increasing, error-prone complexity. 
Furthermore, the immutable nature of IPFS objects makes DAMF assertions
resistant to malicious subversion of the proper execution of a tool; see, for
example, the discussion in [5] concerning attacks on Coq’s .vo object files.

4.3 Other Use Cases


While it is common to view tools that perform pure computations (such as functional
program execution or proof search à la λProlog) as producing assertions
without proofs, there are various well-known reasoning systems that have been
widely used without being either certified or certifying: for example, Twelf [33].
DAMF would enable Twelf-based assertions to be exported to agents willing to 
trust its type and totality checkers. 

The relationship of DAMF to the following topics is discussed in greater detail 
in the technical report [3]: libraries as curation on top of the DAMF model of 
global objects; attacks in the adversarial environment of the web; and possible 
uses of this framework in settings (such as journalism) where the lack of formal
proof increases the need to track trust explicitly.


5 Related Work 


The semantic web [14,15] was proposed to enrich the web with aspects of trust 
and would rely on concepts and technologies such as cryptography, taxonomies, 
ontologies, and inference rules. While the semantic web and DAMF both use 
cryptographic signatures and low-level web-based technologies, DAMF differs 
from the semantic web by focusing on objects rather than documents and using 
richer notions of logic and compositional reasoning. 

Dedukti [8] is a dependently typed λ-calculus augmented with rewriting.
Dedukti can be used to produce adapters (Sect. 2.3): in particular, proofs in 
a source system can be transformed to Dedukti proofs and then transformed 
back into formal proofs in a different system. For example, the Logipedia docu- 
mentation mentions that “some proofs expressed in some Dedukti theories can 
be translated to other proof systems, such as HOL Light, HOL 4, Isabelle/HOL, 
Coq, Matita, Lean, PVS, ...” [29]. As a by-product, Dedukti can be used to build 
correctness-preserving translations of assertions for DAMF. 

TPTP [40] provides a number of standards for the concrete syntax of first- 
order and higher-order logic along with tools for parsing and printing files that 
adhere to such standards. Deploying those tools for the production of the kind 
of multilingual adapters that we have described in Sect. 2.3 is a natural next 
step for tool development within DAMF. 

The idea of distributing some aspects of proof environments goes
back to at least the systems described by Sacerdoti Coen et al. [7,19]. In such
systems, integration was meant to work between “near-peer” systems: that is,
between systems that are both based on rich logics such as higher-order logic
or on typed λ-calculi based on the Curry-Howard correspondence. A prerequisite
for successful integration in such systems is the ability to connect the
semantics of formulas, types, universes, proofs, etc. The widespread use of such
integration approaches has been delayed since it has only been in recent years
that efforts, such as Dedukti [8] and MMT [36,37], are making it possible to 
form the necessary deep and sophisticated ties between the semantics of these 
objects arising from different implementations. In contrast, DAMF allows the 
composition of different assertions without an a priori assumption that there
is a formal semantics that relates them. Of course, correctness is a concern in 
many (most) situations: in those cases, Dedukti and MMT encodings can be used 
to translate assertions between two provers with precise correctness assurances. 
Often, however, the integration is of a more asymmetric kind. For example, when 
integrating a system that only performs integer operations or reasons only with 
integer inequalities (operations that are available in SMT systems) with a system 
based on higher-order logic, producing adapters based on sophisticated encod- 
ings might be completely unnecessary. The DAMF system similarly allows such 
integration. 


6 Conclusion 


We have described a Distributed Assertion Management Framework (DAMF) 
designed to share assertions between agents while tracking dependencies with 
canonical content ids (cids). This framework endows assertions with reliable 
provenance using public key cryptography and distributes them globally using 
the IPFS network. We have given an example of using DAMF to import a Coq 
lemma into Abella. The biggest challenge for future work is to adapt existing 
work on language translation and proof translation (in, e.g., Dedukti) to create 
or derive adapters automatically. Another important matter for future considera- 
tion is whether to persist compositions (i.e., COMPOSE-derivations, cf. Sect. 2.4) 
to DAMF, which can serve as hints for post hoc investigations. 


Abstract. We present ABSTRACT CNF2DDNNF, a calculus describing
an approach for compiling a formula in conjunctive normal form (CNF) 
into deterministic negation normal form (d-DNNF). It combines com- 
ponent-based reasoning with a model enumeration approach based on 
conflict-driven clause learning (CDCL) with chronological backtracking. 
Its properties, such as soundness and termination, carry over to imple- 
mentations which can be modeled by it. We provide a correctness proof 
and a detailed example. The main conceptual differences to currently 
available tools targeting d-DNNF compilation are discussed and future 
research directions presented. The aim of this work is to lay the theo- 
retical foundation for a novel method for d-DNNF compilation. To the 
best of our knowledge, our approach is the first knowledge compilation 
method using CDCL with chronological backtracking. 


Keywords: Knowledge compilation · d-DNNF · Chronological CDCL


1 Introduction 


In real-world applications, constraints may be modeled in conjunctive normal 
form (CNF), but many tasks relevant in AI and reasoning, such as checks for 
consistency, validity, clausal entailment, and implicants, cannot be executed efficiently
on them [9]. Tackling these and other computationally expensive problems
is the aim of the knowledge compilation paradigm [13]. The idea is to
translate a formula into a language in which the task of interest can be executed 
efficiently [22]. The knowledge compilation map [22] contains an in-depth dis- 
cussion of such languages and their properties, and other (families of) languages 
have been introduced since its publication [21,25,29]. The focus in this work is 
on the language deterministic decomposable negation normal form (d-DNNF) 
[19]. It has been applied in planning [2,39], Bayesian reasoning [15], diagnosis 
[3,43], and machine learning [28] as well as in functional E-MAJSAT [40], to men- 
tion a few, and was also studied from a theoretical perspective [7,8, 10]. Several 
d-DNNF compilers are available [20,30,37,48], as well as a d-DNNF reasoner.¹


1 http://www.cril.univ-artois.fr/kc/d-DNNF-reasoner.html.
Translating a formula from CNF to d-DNNF requires processing the search
space exhaustively. The number of variable assignments which need to be checked
is exponential in the number of variables occurring in the formula, and testing
them one by one is out of the question from a computational complexity point of
view. However, if the formula can be partitioned into subformulae defined over
pairwise disjoint sets of variables, these subformulae can be processed
independently and the results combined [4]. This may reduce the amount of work
per computation significantly. Consider F = (a ∨ b) ∧ (c ∨ d) defined over the
set of variables V = {a, b, c, d}. Its search space consists of 2^4 = 16 variable
assignments. The formula F can be partitioned into F_1 = (a ∨ b) and
F_2 = (c ∨ d) defined over the sets of variables V_1 = {a, b} and V_2 = {c, d},
respectively, and such that F = F_1 ∧ F_2. Due to V_1 ∩ V_2 = ∅, d-DNNF
representations of F_1 and F_2 can be computed independently and conjoined,
obtaining a d-DNNF representation of F. Moreover, in each computation we only
need to check 2^2 = 4 assignments. The subformulae F_1 and F_2 are called
components due to the original motivation originating in graph theory, and the
partitioning process is referred to as decomposition or component analysis.
This approach, also called component-based reasoning, is realized in various
exact #SAT solvers [1,4,11,12,41,42,47], and its success suggests that formulae
stemming from real-world applications decompose well enough to save a
substantial amount of work.
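
The partitioning step can be illustrated with a short Python sketch. It is not taken from any of the cited solvers: the function name `split_components`, the DIMACS-style integer encoding of literals, and the union-find bookkeeping are assumptions made for this illustration, and clauses are assumed to be non-empty.

```python
from collections import defaultdict

def split_components(clauses):
    """Partition a CNF (list of non-empty clauses, each a list of nonzero
    ints in DIMACS style) into groups of clauses over pairwise disjoint
    sets of variables."""
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for clause in clauses:
        variables = [abs(lit) for lit in clause]
        for v in variables:
            parent.setdefault(v, v)
        # All variables sharing a clause belong to the same component.
        for v in variables[1:]:
            parent[find(v)] = find(variables[0])

    groups = defaultdict(list)
    for clause in clauses:
        groups[find(abs(clause[0]))].append(clause)
    return list(groups.values())


# F = (a ∨ b) ∧ (c ∨ d) with a=1, b=2, c=3, d=4 (encoding is an assumption)
components = split_components([[1, 2], [3, 4]])
print(components)  # [[[1, 2]], [[3, 4]]]
```

For the example formula this yields the two components F_1 and F_2, each of which has only 2^2 = 4 assignments to check instead of 2^4 = 16 for F as a whole.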

The formula F in our example satisfies decomposability [22], i.e., for each
conjunction, the conjuncts are defined over pairwise disjoint sets of variables.
We call such a formula decomposable. Negations occur only in front of literals,
hence it is in decomposable negation normal form (DNNF) [17,18]. A formula
in which for each disjunction its disjuncts are pairwise logically contradictory
satisfies determinism [22], i.e., for each disjunction C_1 ∨ ... ∨ C_n it holds
that C_i ∧ C_j ≡ ⊥ for i, j ∈ {1, ..., n} and i ≠ j. A deterministic DNNF
formula is said to be in d-DNNF. Determinism is also met by the language
disjoint sum of products (DSOP), which is a disjunction of pairwise
contradictory conjunctions of literals, and which is relevant in circuit
design [5]. In a previous work [34], we introduced an approach for translating
a CNF formula into DSOP based on CDCL with chronological backtracking. The
motivation for using chronological backtracking is twofold. First, it has been
shown not to significantly harm solver performance [33,38]. Second, pairwise
disjoint models are detected without the need for blocking clauses commonly
used in model enumeration based on CDCL with non-chronological backtracking.
Blocking clauses rule out already found models, but they also slow down the
solver, and avoiding their usage in model enumeration by means of CDCL with
chronological backtracking has empirically been shown to be effective [46].
Enhancing our former approach [34] by component-based reasoning enables us to
compute a d-DNNF representation of a CNF formula. Reconsider our previous
example, and suppose we obtained dsop(F_1) = a ∨ (¬a ∧ b) and
dsop(F_2) = c ∨ (¬c ∧ d). Now F = F_1 ∧ F_2, hence F ≡ dsop(F_1) ∧
dsop(F_2) = (a ∨ (¬a ∧ b)) ∧ (c ∨ (¬c ∧ d)), which is in d-DNNF.
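
The claims about this small example can be checked by brute force. The following Python sketch is only a sanity check over all 16 assignments, not part of the compilation method; the function names `f` and `ddnnf` are chosen for this illustration.

```python
from itertools import product

def f(a, b, c, d):
    # F = (a ∨ b) ∧ (c ∨ d)
    return (a or b) and (c or d)

def ddnnf(a, b, c, d):
    # dsop(F_1) ∧ dsop(F_2) = (a ∨ (¬a ∧ b)) ∧ (c ∨ (¬c ∧ d))
    return (a or ((not a) and b)) and (c or ((not c) and d))

assignments = list(product([False, True], repeat=4))

# Equivalence: both formulas agree on all 16 assignments.
equivalent = all(f(*s) == ddnnf(*s) for s in assignments)

# Determinism: the disjuncts a and (¬a ∧ b) are never true together,
# and likewise c and (¬c ∧ d).
deterministic = all(
    not (a and ((not a) and b)) and not (c and ((not c) and d))
    for a, b, c, d in assignments
)

# Number of models of F, counted by enumeration.
model_count = sum(1 for s in assignments if f(*s))

print(equivalent, deterministic, model_count)  # True True 9
```

The count 9 = 3 · 3 also reflects the structure of the d-DNNF: each of the two variable-disjoint conjuncts has three models over its own two variables.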


Our Contributions. We present ABSTRACT CNF2DDNNF, ACD for short, 
a declarative formal framework describing the compilation of CNF into d-DNNF 
and a proof of its correctness. This abstract presentation allows for a thorough
understanding of our method at a conceptual level and of its correctness. If 
our framework is sound, every implementation which can be modeled by it is 
sound as well. This comprises optimizations and implementation details, such as 
caches. ACD combines component-based reasoning and CNF-to-DSOP compi- 
lation based on conflict-driven clause learning (CDCL) with chronological back- 
tracking. Disjunctions with pairwise contradictory disjuncts are introduced by 
decisions and subsequently flipping their value upon backtracking, while con- 
junctions whose conjuncts share no variable are introduced by unit propagation 
and decomposition. For the sake of simplicity, in our calculus formulae are par- 
titioned into two subformulae. However, lifting it to an arbitrary number of 
subcomponents is straightforward, and a corresponding generalization is pre- 
sented. 


2 Preliminaries 


Let V be a set of propositional variables defined over the set of Boolean
constants ⊥ (false) and ⊤ (true), denoted by 𝔹 = {⊥, ⊤}. A literal is either a
variable v ∈ V or its negation ¬v. We refer to the variable of a literal ℓ by
var(ℓ) and extend this notation to sets and sequences of literals and formulae.
We consider formulae in conjunctive normal form (CNF), which are conjunctions
of clauses, which in turn are disjunctions of literals. A formula in disjoint
sum of products (DSOP) is a disjunction of pairwise contradictory cubes, which
are conjunctions of literals. Our target language is deterministic decomposable
negation normal form (d-DNNF), whose formulae are built of literals,
conjunctions whose conjuncts share no variables, and disjunctions whose
disjuncts are pairwise contradictory. We might interpret formulae as sets of
clauses and cubes, and clauses and cubes as sets of literals, by writing C ∈ F
and ℓ ∈ C to refer to a clause C in a formula F and a literal ℓ contained in a
clause or cube C, respectively. The empty CNF formula and the empty cube are
denoted by ⊤, and the empty DSOP formula and the empty clause by ⊥.

A total variable assignment is a mapping σ: V → 𝔹, and a trail I = ℓ_1 ... ℓ_n
is a non-contradictory sequence of literals which might also be interpreted as a
(possibly partial) assignment, such that I(ℓ) = ⊤ iff ℓ ∈ I. Similarly, I(C) and
I(F) are defined. We might interpret a trail I as a set of literals and write
ℓ ∈ I to refer to the literal ℓ on I. The empty trail is denoted by ε and the
set of variables of the literals on I by var(I). Trails and literals can be
concatenated, written IJ and Iℓ, given var(I) ∩ var(J) = ∅ and
var(I) ∩ var(ℓ) = ∅. The position of ℓ on the trail I is denoted by τ(I, ℓ).
The decision literals on I are annotated by a superscript, e.g., ℓ^d, denoting
open “left” branches in the sense of the Davis-Putnam-Logemann-Loveland (DPLL)
algorithm [23,24]. Flipping the value of a decision literal can be seen as
closing the corresponding left branch and starting a “right” branch, where the
decision literal ℓ^d becomes a flipped literal ¬ℓ.

The residual of F under I, written F|_I, is obtained by assigning the variables
in F their truth value and by propagating truth values through Boolean
connectives. The notion of residual is extended to clauses and literals. A unit
clause is a clause {ℓ} containing one single literal ℓ. By units(F)
(units(F|_I)) we denote the set of unit literals in F (F|_I). Similarly,
decs(I) denotes the set of decision literals on I. By writing ℓ ∈ decs(I)
(ℓ ∈ units(F), ℓ ∈ units(F|_I)), we refer to a decision literal ℓ on I (unit
literal in F, F|_I). A trail I falsifies F if I(F) = ⊥, i.e., F|_I = ⊥. It
satisfies F, written I ⊨ F, if I(F) = ⊤, i.e., F|_I = ⊤, and is then called a
model of F. If var(I) = V, I is a total model, otherwise it is a partial model.

The trail is partitioned into decision levels, starting with a decision literal
and extending until the literal preceding the next decision. The decision level
function δ: V → ℕ ∪ {∞} returns the decision level of a variable v ∈ V. If v is
unassigned, δ(v) = ∞, and δ is updated whenever a variable is assigned or
unassigned, e.g., δ[v ↦ d] if v is assigned to decision level d. We define
δ(ℓ) = δ(var(ℓ)), δ(C) = max{δ(ℓ) | ℓ ∈ C} for C ≠ ⊥ and
δ(I) = max{δ(ℓ) | ℓ ∈ I} for I ≠ ε, extending this notation to sets of
literals. Finally, we define δ(⊥) = δ(ε) = ∞. By writing δ[I ↦ ∞], all literals
on the trail I are unassigned. The decision level function is left-associative,
i.e., δ[I ↦ ∞][ℓ ↦ d] expresses that first all literals on I are unassigned and
then literal ℓ is assigned to decision level d.

Unlike in CDCL with non-chronological backtracking [36,44,45], in chronological
CDCL [33,38] literals may not be ordered on the trail in ascending order with
respect to their decision level. We write I_{≤n} (I_{<n}, I_{=n}) for the
subsequence of I containing all literals ℓ with δ(ℓ) ≤ n (δ(ℓ) < n, δ(ℓ) = n).
The pending search space of I is given by the assignments not yet tested [34],
i.e., I and its open right branches R(I), and is defined as O(I) = I ∨ R(I),
where R(I) = ∨_{ℓ ∈ decs(I)} R_{=δ(ℓ)}(I) and R_{=δ(ℓ)}(I) = ¬ℓ ∧ I_{<δ(ℓ)} for
ℓ ∈ decs(I). As an example, for I = a b^d c d e^d f,
O(I) = (a ∧ b ∧ c ∧ d ∧ e ∧ f) ∨ (¬b ∧ a) ∨ (¬e ∧ a ∧ b ∧ c ∧ d).
Similarly, the pending models of F are the satisfying assignments of F not yet
detected, which are given by F ∧ O(I).
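
The definition of O(I) can be traced on the example trail with a few lines of Python. This is a sketch under assumptions: the trail is encoded as (literal, decision level, is-decision) triples, negation is written with a leading `~`, and the decision levels (a at level 0; b, c, d at level 1; e, f at level 2) are not stated in the text but are the ones consistent with the O(I) given above.

```python
def pending_search_space(trail):
    """trail: list of (literal, decision_level, is_decision) triples.
    Returns O(I) = I ∨ R(I) as a list of cubes (lists of literals)."""
    def negate(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit

    cubes = [[lit for lit, _, _ in trail]]  # the trail I itself
    for lit, level, is_decision in trail:
        if is_decision:
            # Open right branch: R_{=δ(ℓ)}(I) = ¬ℓ ∧ I_{<δ(ℓ)}
            below = [k for k, lvl, _ in trail if lvl < level]
            cubes.append([negate(lit)] + below)
    return cubes


# I = a b^d c d e^d f (levels are an assumption, see above)
trail = [("a", 0, False), ("b", 1, True), ("c", 1, False),
         ("d", 1, False), ("e", 2, True), ("f", 2, False)]
for cube in pending_search_space(trail):
    print(" & ".join(cube))
# a & b & c & d & e & f
# ~b & a
# ~e & a & b & c & d
```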


3 Chronological CDCL for CNF-to-d-DNNF Compilation 


In static component analysis the component structure is computed once, typ- 
ically as a preprocessing step, and not altered during the further execution. 
In contrast, in our approach the component structure is computed iteratively 
adopting dynamic component analysis. Algorithm 1 provides a general schema 
in pseudo-code. It is formulated recursively, capturing the recursive nature of 
dynamic component analysis. Lines 1-7 and 11 describe model enumeration 
based on chronological CDCL [34], while lines 8-10 capture component anal- 
ysis. 

Now assume unit propagation has been carried out until completion, no conflict
has occurred and there are still unassigned variables (line 8). If F|_I can be
decomposed into two formulae G and H, we call CNF2dDNNF recursively on G
and H, conjoin the outcomes of these computations with I and add the result to
M (line 9). If I contains no decisions, the search space has been explored
exhaustively, otherwise chronological backtracking occurs (line 10). The working
of our approach is shown by an example.
Algorithm 1: CNF2dDNNF(F, V, I, M)
input : CNF F, V = var(F), I = ε, M = ⊥
output: d-DNNF M ≡ F

 1 loop
 2   I ← PropagateUnits()
 3   if conflict occurs then
 4     if conflict level = 0 then return M else AnalyzeConflict()
 5   else if I(F) = ⊤ then
 6     M ← M ∨ I
 7     if there are no decisions on I then return M else BacktrackChrono()
 8   else if F|_I can be decomposed into G and H then
 9     M ← M ∨ I ∧ CNF2dDNNF(G, var(G), ε, ⊥) ∧ CNF2dDNNF(H, var(H), ε, ⊥)
10     if there are no decisions on I then return M else BacktrackChrono()
11   else Decide()
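
To make the recursive schema concrete, the following Python sketch mirrors its overall shape: propagate units, decompose if possible, otherwise decide and later flip the decision. It is deliberately simplified and is not the method of Algorithm 1 itself: conflict analysis and clause learning are replaced by plain DPLL-style exhaustive search, decision levels are not tracked, and a formula may be split into more than two components at once. All function names and the integer literal encoding are assumptions of this sketch.

```python
from itertools import product

def residual(clauses, lit):
    """F|_lit: drop clauses satisfied by lit, delete -lit from the rest."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def components(clauses):
    """Group clauses into sets over pairwise disjoint variables."""
    groups = []  # list of (variables, clauses) pairs
    for clause in clauses:
        variables, merged, rest = {abs(l) for l in clause}, [clause], []
        for group_vars, group_clauses in groups:
            if group_vars & variables:
                variables |= group_vars
                merged += group_clauses
            else:
                rest.append((group_vars, group_clauses))
        groups = rest + [(variables, merged)]
    return [group_clauses for _, group_clauses in groups]

def compile_ddnnf(clauses):
    """Return a d-DNNF as nested tuples ("and", ...), ("or", ...),
    integer literals, True or False."""
    trail = []
    while True:  # unit propagation
        if any(len(c) == 0 for c in clauses):
            return False  # conflict under the current assignment
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        trail.append(unit)
        clauses = residual(clauses, unit)
    if not clauses:
        rest = []  # the trail satisfies the formula
    else:
        parts = components(clauses)
        if len(parts) > 1:  # decomposition into independent components
            rest = [compile_ddnnf(p) for p in parts]
        else:  # decision ("left" branch), then its flip ("right" branch)
            lit = clauses[0][0]
            rest = [("or", compile_ddnnf(clauses + [[lit]]),
                     compile_ddnnf(clauses + [[-lit]]))]
    conjuncts = trail + rest
    if not conjuncts:
        return True
    return conjuncts[0] if len(conjuncts) == 1 else ("and", *conjuncts)

def evaluate(node, assignment):
    """Evaluate a compiled formula under a total assignment (dict)."""
    if isinstance(node, bool):  # must be tested before int
        return node
    if isinstance(node, int):
        return assignment[abs(node)] == (node > 0)
    op, *children = node
    values = [evaluate(child, assignment) for child in children]
    return all(values) if op == "and" else any(values)

# The formula of Example 1 with a=1, ..., h=8 (encoding is an assumption).
F = [[1], [-1, -2, 3, 4], [-1, -2, 5, 6], [2, -3, 5], [2, 4, 6], [7, 8]]
M = compile_ddnnf(F)
count = 0
for bits in product([False, True], repeat=8):
    s = dict(zip(range(1, 9), bits))
    cnf_value = all(any(s[abs(l)] == (l > 0) for l in c) for c in F)
    assert evaluate(M, s) == cnf_value  # M is equivalent to F
    count += cnf_value
print(count)  # 54 models
```

Determinism holds in this sketch because the two disjuncts of every "or" node contain the decision literal and its negation, respectively; decomposability holds because propagated literals are removed from the residual formula and components share no variables.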


Example 1. Let V = {a, b, c, d, e, f, g, h} be a set of propositional variables
and F = (a) ∧ (¬a ∨ ¬b ∨ c ∨ d) ∧ (¬a ∨ ¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧
(b ∨ d ∨ f) ∧ (g ∨ h) be a formula defined over V. The execution is depicted as
a tree in Fig. 1. For the sake of readability, we show only the formula on which
a rule is executed, represented by a box annotated with its component level.
Black arrows correspond to “downward” rule applications, while violet (gray)
arrows represent “upward” rule applications and are annotated with the formula
returned by the computation of a component. Ignore the rule names for now; they
are intended to clarify the working of our calculus, which is presented in
Sect. 4. We see that, first, a is propagated, denoted by the black vertical
arrow annotated with a and the name of the applied rule (Unit). The residual of
F under a is F|_a = (¬b ∨ c ∨ d) ∧ (¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f) ∧
(g ∨ h) (not shown). It contains no unit clause but can be decomposed into
(¬b ∨ c ∨ d) ∧ (¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f) and (g ∨ h). Two new
(sub)components are created (by applying rule Decompose) with component level
01 and 02, respectively, represented by the shadowed boxes.
Since (g ∨ h) cannot be decomposed further, model enumeration with
chronological CDCL is executed on it (not shown) by deciding g (rule Decide)
satisfying (g ∨ h), followed by backtracking chronologically (BackTrue), which
amounts to negating the value of the most recent decision g, and propagating h
(Unit). The processing of (g ∨ h) terminates with g ∨ ¬g ∧ h (CompTrue, not
shown). But before this result can be used further, the subcomponent at
component level 01 needs to be processed. Its formula is
G = (¬b ∨ c ∨ d) ∧ (¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f). It neither
contains a unit nor can it be decomposed, hence we take a decision, let's say,
b^d. Now G|_b = (c ∨ d) ∧ (e ∨ f), which is decomposed into two components with
one clause each and component level 011 and 012, respectively (Decompose).
These formulae cannot be decomposed further, and they are processed
independently, similarly to (g ∨ h). Before G was decomposed, a decision was
taken, and we backtrack combining the results of its subcomponents
(ComposeBack). We have G|_¬b = (¬c ∨ e) ∧ (d ∨ f), resulting in two components
with component levels 011 and 012, respectively. They are processed and their
results combined, after which the results of the subcomponents of the root
component are conjoined with a. There is no decision on the trail, and the
process terminates with M ≡ (a) ∧ (¬a ∨ ¬b ∨ c ∨ d) ∧ (¬a ∨ ¬b ∨ e ∨ f) ∧
(b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f) ∧ (g ∨ h) (ComposeEnd). Notice that although
component levels can occur multiple times throughout the computation, they are
unique at any point in time.

[Figure 1: tree of components. The root (level 0) holds F; after Unit (a) and
Decompose it has the children 01 (formula G) and 02 (formula (g ∨ h)).
Component 01 is decomposed under the decision b into 011 (c ∨ d) and
012 (e ∨ f) and, after ComposeBack, under ¬b into 011 (¬c ∨ e) and 012 (d ∨ f).
Upward arrows carry the returned formulae, ending at the root (ComposeEnd) with
a ∧ (g ∨ ¬g ∧ h) ∧ (b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f) ∨ ¬b ∧ (c ∧ e ∨ ¬c) ∧
(d ∨ ¬d ∧ f)).]

Fig. 1. Component structure of F created by ACD.


4 Calculus 


Due to its recursive nature, combining the results computed for subcomponents
in CNF2dDNNF is straightforward. For its formalization, however, a
non-recursive approach turned out to be better suited. Consequently, a method
is needed for matching subcomponents and their parent. For this purpose, a
component level is associated with each component. It is defined as a string of
numbers in ℕ as follows. Suppose a component C is assigned level “d” and assume
its formula is decomposed into two subformulae. The corresponding subcomponents
C_G and C_H are assigned component levels “d·1” and “d·2”, respectively, with
“·” denoting string composition. Accordingly, the component level of their
parent C is given by the substring consisting of all but the last element of
their level, i.e., “d”.² The root component holds the input formula, it has no
parent and its component level is zero. A component is closed if no rule can be
applied to it, and decomposed if either at least one of its subcomponents is
not closed or both its subcomponents are closed, but their results are not yet
combined. Components which are neither closed nor decomposed are open.³ Closed
components may be discarded as soon as their results are combined, and the
computation stops as soon as the root component is closed. With these remarks,
we are ready to present our calculus.

2 From now on, we omit the quotes for the sake of readability.
3 The differentiation between open and decomposed components is purely technical
and needed for the termination proof in Sect. 5.

We describe our algorithm in terms of a state transition system ABSTRACT
CNF2DDNNF, ACD for short, over a set of global states S, a transition relation
⇝ ⊆ S × S and an initial global state S_0. A global state is a set of
components. A component C is described as a seven-tuple (F, V, d, e, I, M, δ)^s,
where s denotes its component state. It is c if C is closed, f if C is
decomposed, and o if C is open. The first two elements F and V refer to a
formula and its set of variables, respectively. The third element d denotes the
component level of C. If d ≠ 0, then d ∈ {d′·1, d′·2}, where d′ is the
component level of the parent component of C, as explained above. In this
manner, the component level keeps track of the decomposition structure of F and
is used to match parent components and their subcomponents. The number of
subcomponents of C is given by e, while I and δ refer to a trail ranging over
variables in V and a decision level function with domain V, respectively.
Finally, M is a formula in d-DNNF representing the models of F found so far. A
component is initialized by (F, V, d, 0, ε, ⊥, ∞)^o and closed after its
computation has terminated, i.e., (F, V, d, 0, I, M, δ)^c. Notice that in these
cases e = 0. The initial global state S_0 = {C_0} consists of the root
component C_0 = (F, V, 0, 0, ε, ⊥, ∞)^o with F denoting the input formula and
V = var(F), while the final global state is given by
S_n = {(F, V, 0, 0, I, M, δ)^c} where M ≡ F is in d-DNNF. The transition
relation ⇝ is defined as the union of transition relations ⇝_R, where R is
either Unit, Decide, BackTrue, BackFalse, CompTrue, CompFalse, Decompose,
ComposeBack or ComposeEnd. Our calculus contains three types of rules, which
can abstractly be described as follows:


α: S ⊎ {C} ⇝_R S ⊎ {C′};   β: S ⊎ {C} ⇝_R S ⊎ {C′, C_1, C_2};   γ: S ⊎ {C, C_1, C_2} ⇝_R S ⊎ {C′}.


In this description, S refers to the subset of the current global state
consisting of all components which are not touched by rule R, with ⊎ denoting
the disjoint set union, e.g., in α, C, C′ ∉ S. An α rule affects a component C,
turning it into C′. The rules Unit, Decide, BackTrue, BackFalse, CompTrue, and
CompFalse are α rules. A β rule modifies C, obtaining C′, and creates two new
components C_1 and C_2. Rule Decompose is the only β rule. Finally, a γ rule
removes the two components C_1 and C_2 from the global state and modifies their
parent C. Rules ComposeBack and ComposeEnd are γ rules. The rules are listed in
Fig. 2.
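
The bookkeeping behind the β and γ rule types can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the calculus itself: the variable set V and the decision level function δ are omitted, formulae and M are kept as opaque strings, the global state is a dictionary keyed by component level, and the class and function names are hypothetical. The guard conditions follow the descriptions of Decompose and ComposeEnd given in this section.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Component:
    formula: str           # F, kept as an opaque string in this sketch
    level: str             # component level d, e.g. "0", "01", "012"
    subs: int = 0          # e, the number of subcomponents
    trail: tuple = ()      # I, as (literal, is_decision) pairs
    models: str = "false"  # M, also an opaque string here
    state: str = "o"       # s: "o" open, "f" decomposed, "c" closed

def decompose(state, level, g, h):
    """β rule: the parent becomes decomposed, two open children appear."""
    parent = state[level]
    assert parent.state == "o" and parent.subs == 0
    new_state = dict(state)
    new_state[level] = replace(parent, subs=2, state="f")
    new_state[level + "1"] = Component(g, level + "1")
    new_state[level + "2"] = Component(h, level + "2")
    return new_state

def compose_end(state, level):
    """γ rule: two closed children are removed, the parent is closed."""
    parent, left, right = (state[level + s] for s in ("", "1", "2"))
    assert parent.state == "f" and left.state == right.state == "c"
    assert not any(is_decision for _, is_decision in parent.trail)
    cube = [lit for lit, _ in parent.trail] + [left.models, right.models]
    models = f"{parent.models} | ({' & '.join(cube)})"
    new_state = {k: v for k, v in state.items()
                 if k not in (level + "1", level + "2")}
    new_state[level] = replace(parent, subs=0, models=models, state="c")
    return new_state

s0 = {"0": Component("F", "0", trail=(("a", False),))}
s1 = decompose(s0, "0", "G", "H")
# Pretend both children were processed by α rules and are now closed.
s1["01"] = replace(s1["01"], models="N", state="c")
s1["02"] = replace(s1["02"], models="O", state="c")
s2 = compose_end(s1, "0")
print(s2["0"].models, s2["0"].state)  # false | (a & N & O) c
```

Keying the global state by component level reflects the role the level plays in the calculus: it is the only information needed to match a parent with its two subcomponents.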


Model Computation. Rules Unit, Decide, BackTrue, BackFalse, CompTrue, and
CompFalse execute model enumeration with chronological CDCL [34] and are
applicable exclusively to open components. Unit literals are assigned the
decision level of their reason, which might be lower than the current decision
level (rule Unit). Decisions can be taken only if the processed formula is not
decomposable (Decide). Backtracking occurs chronologically, i.e., to the second
highest decision level on the trail, after finding a model (BackTrue) and to
the decision level preceding the conflict level after conflict analysis
(BackFalse), respectively. In the latter case, the propagated literal is
assigned the lowest level at which the learned clause becomes unit and to which
a SAT solver implementing CDCL with non-chronological backtracking would
backtrack. Since the literals might not be ordered on the trail in ascending
order with respect to their decision level, a non-contiguous part of it is
discarded. Finally, a component is closed if its trail contains no decisions
and either satisfies its formula (CompTrue) or a conflict occurs at decision
level zero, i.e., the conflicting clause has decision level zero (CompFalse).
In the former case, the newly found model is recorded.


Unit: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_Unit S ⊎ {(F, V, d, 0, Iℓ, M, δ[ℓ ↦ a])^o} if
    ⊥ ∉ F|_I and exists C ∈ F with {ℓ} = C|_I and a ≝ δ(C \ {ℓ}) if
    C \ {ℓ} ≠ ⊥ and a ≝ 0 otherwise

Decide: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_Decide S ⊎ {(F, V, d, 0, Iℓ^d, M, δ[ℓ ↦ a])^o} if
    F|_I ≠ ⊤ and ⊥ ∉ F|_I and units(F|_I) = ∅ and var(ℓ) ∈ V and δ(ℓ) = ∞ and
    a ≝ δ(I) + 1 and there exist no G and H such that G ∧ H = F|_I and
    var(G) ∩ var(H) = ∅

BackTrue: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_BackTrue
    S ⊎ {(F, V, d, 0, PKℓ, M ∨ I, δ[L ↦ ∞][ℓ ↦ e])^o} if F|_I = ⊤ and
    PQ ≝ I and D ≝ ¬decs(I) and e + 1 ≝ δ(D) = δ(I) and ℓ ∈ D and
    e = δ(D \ {ℓ}) = δ(P) and K ≝ Q_{≤e} and L ≝ Q_{>e}

BackFalse: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_BackFalse
    S ⊎ {(F, V, d, 0, PKℓ, M, δ[L ↦ ∞][ℓ ↦ j])^o} if exists C ∈ F and
    exists D with PQ ≝ I and C|_I = ⊥ and c ≝ δ(C) = δ(D) > 0 such that
    ℓ ∈ D and ¬ℓ ∈ decs(I) and ℓ|_Q = ⊥ and F ∧ ¬M ⊨ D and
    j ≝ δ(D \ {ℓ}) and b ≝ δ(P) = c − 1 and K ≝ Q_{≤b} and L ≝ Q_{>b}

CompTrue: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_CompTrue S ⊎ {(F, V, d, 0, I, M ∨ I, δ)^c} if
    F|_I = ⊤ and decs(I) = ∅

CompFalse: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_CompFalse S ⊎ {(F, V, d, 0, I, M, δ)^c} if
    exists C ∈ F and C|_I = ⊥ and δ(C) = 0

Decompose: S ⊎ {(F, V, d, 0, I, M, δ)^o} ⇝_Decompose
    S ⊎ {(F, V, d, 2, I, M, δ)^f, (G, U, d·1, 0, ε, ⊥, ∞)^o, (H, W, d·2, 0, ε, ⊥, ∞)^o} if
    F|_I ≠ ⊤ and ⊥ ∉ F|_I and units(F|_I) = ∅ and G ∧ H ≝ F|_I and
    U ≝ var(G) and W ≝ var(H) and U ∩ W = ∅

ComposeBack: S ⊎ {(F, V, d, 2, I, M, δ_F)^f,
    (G, U, d·1, 0, J_G, N, δ_G)^c, (H, W, d·2, 0, J_H, O, δ_H)^c} ⇝_ComposeBack
    S ⊎ {(F, V, d, 0, PKℓ, M ∨ (I ∧ N ∧ O), δ_F[L ↦ ∞][ℓ ↦ e])^o} if PQ ≝ I and
    D ≝ ¬decs(I) and e + 1 ≝ δ(D) = δ(I) and ℓ ∈ D and
    e = δ(D \ {ℓ}) = δ(P) and K ≝ Q_{≤e} and L ≝ Q_{>e}

ComposeEnd: S ⊎ {(F, V, d, 2, I, M, δ)^f,
    (G, U, d·1, 0, J_G, N, δ_G)^c, (H, W, d·2, 0, J_H, O, δ_H)^c} ⇝_ComposeEnd
    S ⊎ {(F, V, d, 0, I, M ∨ (I ∧ N ∧ O), δ)^c} if decs(I) = ∅


Fig. 2. ACD transition rules.
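
The level assignment of rule Unit can be illustrated in isolation. The Python sketch below assumes integer literals, a dictionary `level` from variables to decision levels, and a hypothetical function name; it only computes the level a, it does not perform propagation.

```python
def propagation_level(clause, unit_literal, level):
    """Decision level for a propagated literal, following rule Unit:
    the maximum level among the other (falsified) literals of its
    reason clause, or 0 if the clause has no other literals."""
    others = [lit for lit in clause if lit != unit_literal]
    return max((level[abs(lit)] for lit in others), default=0)


# Hypothetical situation: variables 1 and 2 are assigned at levels 1 and 3.
level = {1: 1, 2: 3}
# The clause (¬x1 ∨ x4) has reason level 1, below the current level 3.
print(propagation_level([-1, 4], 4, level))      # 1
print(propagation_level([-1, -2, 5], 5, level))  # 3
print(propagation_level([6], 6, level))          # 0
```

The first call shows why literals need not appear on the trail in ascending order of decision level: a literal propagated while the search is at level 3 may still be assigned level 1.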


Component Analysis. Rules Decompose, ComposeBack, and ComposeEnd cap- 
ture the decomposition of a formula and the combination of the models of its 
subformulae and thus affect multiple components. 


Decompose. The state of the parent component C with formula F is o (open). The
trail I neither satisfies nor falsifies F, and F|_I contains no unit clause but
can be partitioned into two formulae G and H defined over disjoint sets of
variables. Subcomponents for G and H are created, the number of subcomponents
of C is set to two and its state is changed to f (decomposed). Notice that C
can only be processed further after its subcomponents are closed.


ComposeBack. The state of the component C with formula F is f (decomposed).
Its subcomponents C_G and C_H with formulae G and H, respectively, have state
c (closed). Furthermore, N ≡ G and O ≡ H, hence F|_I ≡ I ∧ N ∧ O, which is
added to M. This corresponds to enumerating multiple models of F in one step.
This can easily be seen by applying the distributive laws to I ∧ N ∧ O, which
gives us a DSOP formula whose disjuncts are satisfying assignments of F|_I. The
search space has not yet been processed exhaustively (δ(I) > 0), backtracking
to the second highest decision level occurs, and the state of C is changed back
to o (open). Finally, C_G and C_H are removed from the global state. If I
cannot be extended to a model of F, we have N ≡ ⊥ or O ≡ ⊥, and
I ∧ N ∧ O ≡ ⊥. Otherwise, I ∧ N ∧ O ≢ ⊥. Both cases are captured by rule
ComposeBack.
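
The remark about the distributive laws can be made concrete with a small Python sketch. It assumes that N and O are already given in DSOP form, as lists of cubes, which holds for the single-clause components of the running example but not for d-DNNF formulae in general; the function name and the string encoding of literals are likewise assumptions.

```python
from itertools import product

def distribute(trail, n_cubes, o_cubes):
    """Expand I ∧ N ∧ O, with N and O given in DSOP as lists of cubes,
    into a list of cubes, one per pair of disjuncts."""
    return [list(trail) + list(n) + list(o)
            for n, o in product(n_cubes, o_cubes)]


# I = b, N = c ∨ ¬c ∧ d, O = e ∨ ¬e ∧ f
I = ["b"]
N = [["c"], ["~c", "d"]]
O = [["e"], ["~e", "f"]]
for cube in distribute(I, N, O):
    print(" & ".join(cube))
# b & c & e
# b & c & ~e & f
# b & ~c & d & e
# b & ~c & d & ~e & f
```

A single application of ComposeBack thus records four pairwise contradictory partial models at once. If N or O is the empty DSOP formula ⊥, the product is empty as well, matching the case I ∧ N ∧ O ≡ ⊥.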


ComposeEnd. The state of the parent component C with formula F is f
(decomposed). Its subcomponents C_G and C_H with formulae G and H,
respectively, are closed. Furthermore, N ≡ G and O ≡ H, hence
F|_I ≡ I ∧ N ∧ O, which is added to M. The search space has been processed
exhaustively (decs(I) = ∅), and the state of C is set to c (closed). Finally,
C_G and C_H are removed from the global state. As in rule ComposeBack, either
I ∧ N ∧ O ≡ ⊥ or I ∧ N ∧ O ≢ ⊥.


Example 2. Reconsider Example 1 with variables V = {a, b, c, d, e, f, g, h} and
F = (a) ∧ (¬a ∨ ¬b ∨ c ∨ d) ∧ (¬a ∨ ¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f) ∧
(g ∨ h) defined over V. The execution trace of ACD is shown in Fig. 3.
Unaffected components are depicted in gray, and model enumeration by means of
chronological CDCL is shown only once in full detail. The execution starts with
the root component C_F containing F. In step (1), the unit literal a is
propagated, upon which F|_a is decomposed into (g ∨ h) and G, creating
components C_(g∨h) and C_G shown in (2). Steps (3) to (6) capture model
enumeration by chronological CDCL of (g ∨ h), i.e., the computation of a DSOP
representation of (g ∨ h), after which C_(g∨h) is closed. Next, the formula G
is processed by deciding b in step (7), decomposing G|_b into (c ∨ d) and
(e ∨ f) and creating components C_(c∨d) and C_(e∨f), respectively, in step (8).
The processing of C_(c∨d) and C_(e∨f) occurs analogously to steps (3) to (6),
resulting in the state shown in (9). The results are conjoined with b, which is
the trail of C_G and under which G|_b was decomposed. Since b is a decision, it
is flipped in (10) to explore its right branch ¬b. The formula G|_¬b is
decomposed into (¬c ∨ e) and (d ∨ f) and components C_(¬c∨e) and C_(d∨f) are
created, as in (11). Their processing, which is not shown, results in the state
depicted in (12), and the results are conjoined with the trail of C_G. Since
its trail contains no decision, C_G is closed, see (13). The global state now
contains the root component and its two subcomponents, which are closed, hence
the rule ComposeEnd is executed, and the computation terminates with the closed
root component and
M = a ∧ (g ∨ ¬g ∧ h) ∧ (b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f) ∨ ¬b ∧ (c ∧ e ∨ ¬c) ∧ (d ∨ ¬d ∧ f)),
where M ≡ F, and which is shown in (14).
             {(F, V, 0, 0, ε, ⊥, δ_00 ≡ ∞)^o}
⇝_Unit       {(F, V, 0, 0, a, ⊥, δ_01 = δ_00[a ↦ 0])^o}   (1)
⇝_Decompose  {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ε, ⊥, δ_020 ≡ ∞)^o,   (2)
              (G = (¬b ∨ c ∨ d) ∧ (¬b ∨ e ∨ f) ∧ (b ∨ ¬c ∨ e) ∧ (b ∨ d ∨ f), {b, c, d, e, f}, 01, 0, ε, ⊥, δ_010 ≡ ∞)^o}

// enumerate models of (g ∨ h) with chronological CDCL
⇝_Decide     {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, g^d, ⊥, δ_021 = δ_020[g ↦ 1])^o,   (3)
              (G, {b, c, d, e, f}, 01, 0, ε, ⊥, δ_010)^o}
⇝_BackTrue   {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g, g, δ_022 = δ_021[g ↦ ∞][¬g ↦ 0])^o,   (4)
              (G, {b, c, d, e, f}, 01, 0, ε, ⊥, δ_010)^o}
⇝_Unit       {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g, δ_023 = δ_022[h ↦ 0])^o,   (5)
              (G, {b, c, d, e, f}, 01, 0, ε, ⊥, δ_010)^o}
⇝_CompTrue   {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (6)
              (G, {b, c, d, e, f}, 01, 0, ε, ⊥, δ_010)^o}
⇝_Decide     {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (7)
              (G, {b, c, d, e, f}, 01, 0, b^d, ⊥, δ_011 = δ_010[b ↦ 1])^o}
⇝_Decompose  {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (8)
              (G, {b, c, d, e, f}, 01, 2, b^d, ⊥, δ_011)^f,
              ((c ∨ d), {c, d}, 011, 0, ε, ⊥, δ_0110 ≡ ∞)^o, ((e ∨ f), {e, f}, 012, 0, ε, ⊥, δ_0120 ≡ ∞)^o}

// enumerate models of (c ∨ d) and (e ∨ f) with chronological CDCL
⇝_CompTrue   {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (9)
              (G, {b, c, d, e, f}, 01, 2, b^d, ⊥, δ_011)^f,
              ((c ∨ d), {c, d}, 011, 0, ¬c d, c ∨ ¬c ∧ d, δ_0113)^c, ((e ∨ f), {e, f}, 012, 0, ¬e f, e ∨ ¬e ∧ f, δ_0123)^c}

// combine results
⇝_ComposeBack {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (10)
              (G, {b, c, d, e, f}, 01, 0, ¬b, b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f), δ_012 = δ_011[b ↦ ∞][¬b ↦ 0])^o}
⇝_Decompose  {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (11)
              (G, {b, c, d, e, f}, 01, 2, ¬b, b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f), δ_012)^f,
              ((¬c ∨ e), {c, e}, 011, 0, ε, ⊥, δ_0110 ≡ ∞)^o, ((d ∨ f), {d, f}, 012, 0, ε, ⊥, δ_0120 ≡ ∞)^o}

// enumerate models of (¬c ∨ e) and (d ∨ f) with chronological CDCL
⇝_CompTrue   {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (12)
              (G, {b, c, d, e, f}, 01, 2, ¬b, b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f), δ_012)^f,
              ((¬c ∨ e), {c, e}, 011, 0, ¬c, c ∧ e ∨ ¬c, δ_0113)^c, ((d ∨ f), {d, f}, 012, 0, ¬d f, d ∨ ¬d ∧ f, δ_0123)^c}

// combine results
⇝_ComposeEnd {(F, V, 0, 2, a, ⊥, δ_01)^f, ((g ∨ h), {g, h}, 02, 0, ¬g h, g ∨ ¬g ∧ h, δ_023)^c,   (13)
              (G, {b, c, d, e, f}, 01, 0, ¬b, b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f) ∨ ¬b ∧ (c ∧ e ∨ ¬c) ∧ (d ∨ ¬d ∧ f), δ_012)^c}
⇝_ComposeEnd {(F, V, 0, 0, a, a ∧ (g ∨ ¬g ∧ h) ∧ (b ∧ (c ∨ ¬c ∧ d) ∧ (e ∨ ¬e ∧ f) ∨ ¬b ∧ (c ∧ e ∨ ¬c) ∧ (d ∨ ¬d ∧ f)), δ_01)^c}   (14)


Fig. 3. Execution trace of ACD for Example 1.


5 Proofs 


For proving correctness, we first show that our calculus is sound by identifying 
invariants which need to hold in a sound global state and show that they still hold 
after the execution of any rule. Then we prove that for any closed component it 
holds that M = F and that ACD can not get stuck and terminates in a correct 
state. Showing termination concludes our proof. 


Definition 1 (Sound Global State). A global state S is sound if for all its
components C = (F, V, d, e, I, M, δ)^s the following invariants hold:

(1) ∀k, ℓ ∈ decs(I) . τ(I, k) < τ(I, ℓ) ⟹ δ(k) < δ(ℓ)
(2) δ(decs(I)) = {1, ..., δ(I)}
(3) ∀n ∈ ℕ . F ∧ ¬M ∧ decs_{≤n}(I) ⊨ I_{≤n}, provided C is open or decomposed
(4) M ∨ O(I) is a d-DNNF, provided C is open or decomposed
(5) M ∨ F ∧ O(I) ≡ F
(6) e > 0 iff (A) e = 2, (B) C is decomposed, (C) S contains components
    C_G = (G, var(G), d·1, e_G, J_G, N, δ_G)^{s_G},
    C_H = (H, var(H), d·2, e_H, J_H, O, δ_H)^{s_H},
    such that F|_I = G ∧ H and var(G) ∩ var(H) = ∅
(7) If e = 2 and S contains components C_G = (G, var(G), d·1, 0, J_G, N, δ_G)^c
    and C_H = (H, var(H), d·2, 0, J_H, O, δ_H)^c, then F|_I ≡ I ∧ N ∧ O
(8) If C is closed, then decs(I) = ∅


Invariants (1)-(5) correspond to the ones in our previous work [34]. They say
that decisions are ordered in ascending order with respect to their decision
level and that every decision level contains a decision literal. They further
ensure that literals propagated after backtracking upon finding a model are
indeed implied, that no model is enumerated multiple times and that all models
are found. Invariant (3) is only useful for open or decomposed components,
since I remains unaltered when a component is closed. Invariant (4) only holds
for closed components if I(F) = ⊥. Invariants (6) and (7) are concerned with
the properties of a parent component and its subcomponents (for the case
e = 2), such as the definition of the component level. Since, given a trail I,
F|_I is decomposed into formulae G and H, we also have that F|_I ≡ N ∧ O, where
N ≡ G and O ≡ H. Finally, Inv. (8) says that the trail of a closed component
contains no decision.
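
Invariants (1) and (2) are purely structural properties of a trail and can be checked mechanically. The Python sketch below does so for a trail given as (literal, decision level, is-decision) triples; this encoding, the function name, and the treatment of the empty trail as having no decision levels are assumptions of the illustration.

```python
def check_decision_invariants(trail):
    """trail: list of (literal, decision_level, is_decision) triples.
    Returns the truth values of invariants (1) and (2)."""
    decision_levels = [lvl for _, lvl, is_dec in trail if is_dec]
    top = max((lvl for _, lvl, _ in trail), default=0)
    # (1) decisions appear on the trail with strictly increasing levels
    inv1 = all(x < y for x, y in zip(decision_levels, decision_levels[1:]))
    # (2) the decision levels are exactly 1, ..., δ(I)
    inv2 = set(decision_levels) == set(range(1, top + 1))
    return inv1, inv2


# d was propagated at level 1 after the decision c at level 2: allowed.
ok = [("a", 0, False), ("b", 1, True), ("c", 2, True), ("d", 1, False)]
# Decisions out of order with respect to their levels: violates (1).
bad = [("b", 2, True), ("c", 1, True)]
print(check_decision_invariants(ok))   # (True, True)
print(check_decision_invariants(bad))  # (False, True)
```

The first trail shows that the invariants constrain only the decision literals: propagated literals may appear out of order with respect to their decision level, as is typical for chronological CDCL.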


Lemma 1 (Soundness of the Initial Global State). The initial global state
S_0 = {(F, V, 0, 0, ε, ⊥, ∞)^o} is sound.


Proof. Due to I = ε and e = 0 and since the (root) component is open, all
invariants in Definition 1 are trivially met.


Theorem 1 (Soundness of ACD Rules). The rules of ACD preserve
soundness, i.e., they transform a sound global state into another sound global
state.


Proof. The proof is carried out by induction over the rule applications. We
assume that prior to the application of a rule the invariants in Definition 1
are met and show that they also hold in the target state. The (parent)
component in the original state is denoted by C = (F, V, d, e, I, M, δ)^s and
in the target state by C′ = (F, V, d′, e′, I′, M′, δ′)^{s′}. Its subcomponents,
if there are any, are written C_G = (G, var(G), d·1, e_G, J, N, δ_G)^{s_G} and
C_H = (H, var(H), d·2, e_H, K, O, δ_H)^{s_H}.


Unit, Decide, BackTrue, and BackFalse: Apart from the additional elements V, d,
e and the component state s, the rules are defined as in the former calculus
[34]. The arguments given in the proof there apply here as well, and after
applying rules Unit, Decide, BackTrue, or BackFalse, Inv. (1)-(5) hold. Notice
that in the proof of Inv. (4), it suffices to replace “DSOP” by “d-DNNF”, since
the relevant property here is determinism. Since e′ = 0, Inv. (6) and (7) do
not apply. An open state is mapped to an open state, hence Inv. (8) holds.


CompTrue and CompFalse: Invariants (1) and (2) hold, since $I$ remains unaffected.
Since C’ is closed, Inv. (3) and (4) are met. The proof that Inv. (5) holds is carried 
out similarly to the proof of Proposition 1 in our previous work [34] for rules
EndTrue and EndFalse, respectively. Since e’ = 0 and I’ = I, Inv. (6) - (8) hold. 


Decompose: The parent component $C$ remains unaltered except for $e' = 2$ and
for its state, which becomes $f$. Both its subcomponents $C_G$ and $C_H$ are open, and
we have $J_G = J_H = \varepsilon$ and $e_G = e_H = 0$. Therefore, Inv. (1) - (5) hold. Invariant
(6) is satisfied by the definition of rule Decompose. Since $C'$ is decomposed and
$C_G$ and $C_H$ are open by definition, Inv. (7) and (8) hold as well.


ComposeBack: It suffices to show that the validity of the invariants for $C'$ is
preserved, since $C_G$ and $C_H$ do not occur in the target state. The most recent
decision literal is flipped, similarly to rule BackTrue. The same argument as the
one given there applies, and Inv. (1) and (2) are satisfied. We need to show
that $F \wedge \neg(M \vee (I \wedge N \wedge O)) \wedge \mathrm{decs}_{\le n}(PK\ell) \models (PK\ell)_{\le n}$ holds for all $n$. The
decision levels of the literals in $PK$ do not change, except for the one of $\ell$, which
is decremented from $e+1$ to $e$. The literal $\ell$ also stops being a decision literal.
Since $\delta(PK\ell) = e$, we can assume $n \le e$. Furthermore,
$F \wedge \neg(M \vee (I \wedge N \wedge O)) \wedge \mathrm{decs}_{\le n}(PK\ell) = (\neg I \wedge (F \wedge \neg M \wedge \mathrm{decs}_{\le n}(I))) \vee (F \wedge \neg M \wedge \neg(N \wedge O) \wedge \mathrm{decs}_{\le n}(I))$,
since $\ell$ is not a decision literal in $PK\ell$ and $I_{\le e} = PK$ and thus $I_{\le n} = (PK)_{\le n}$
by definition. By applying the induction hypothesis, we get
$\neg I \wedge F \wedge \neg M \wedge \mathrm{decs}_{\le n}(PK\ell) \models (PK)_{\le n}$, and hence
$F \wedge \neg(M \vee (I \wedge N \wedge O)) \wedge \mathrm{decs}_{\le n}(PK\ell) \models (PK)_{\le n}$.
We still need to show that $F \wedge \neg(M \vee (I \wedge N \wedge O)) \wedge \mathrm{decs}_{\le e}(PK\ell) \models \ell$,
as $\delta(\ell) = e$ in $PK\ell$ after applying ComposeBack and thus $\ell$ disappears from the
proof obligation for $n < e$. Notice that $F \wedge \neg D \models I$ using again the induction
hypothesis for $n = e+1$. This gives us $F \wedge \neg\mathrm{decs}_{\le e}(PK) \wedge \neg\ell \models I$ and thus
$F \wedge \neg\mathrm{decs}_{\le e}(PK) \wedge \neg I \models \ell$ by conditional contraposition, and Inv. (3) holds.

For proving that Inv. (4) holds, we consider two cases: (A) $I \wedge N \wedge O \ne \bot$,
i.e., there exists an extension of $I$ which satisfies $F$, and (B) $I \wedge N \wedge O = \bot$, i.e.,
all extensions of $I$ falsify $F$. For both cases, we know that $M \vee O(I)$ is a d-DNNF.

(A) We need to show that $M \vee (I \wedge N \wedge O) \vee O(PK\ell)$ is a d-DNNF. Due
to $\delta(I) = e+1$, we have $O(I) = I \vee R_{\le e+1}(I) = I \vee R_{\le e}(I) \vee R_{=e+1}(I)$. The
pending search space of $PK\ell$ is given by $O(PK\ell) = PK\ell \vee R_{\le e}(PK\ell)$. But
$PK = I_{\le e}$ and $PK\ell = I_{\le e}\,\ell = R_{=e+1}(I)$, since $\neg\ell \in \mathrm{decs}(I)$ and $\delta(\neg\ell) = e+1$.
Furthermore, $R_{\le e}(PK\ell) = R_{\le e}(PK)$, since $\ell \notin \mathrm{decs}(PK\ell)$ and $\delta(\ell) = e$, hence
$R_{\le e}(PK\ell) = R_{\le e}(I)$. We have $O(PK\ell) = R_{=e+1}(I) \vee R_{\le e}(I)$, hence
$O(PK\ell) \vee I = O(I)$ and $(M \vee I) \vee O(PK\ell) = M \vee O(I)$, which is a DSOP and hence a
d-DNNF. Now $I$, $N$, and $O$ are defined over pairwise disjoint sets of variables
by construction, i.e., $I \wedge N \wedge O$ is decomposable, and $M \vee (I \wedge N \wedge O) \vee O(PK\ell)$
is a d-DNNF.

(B) We need to show that $M \vee O(PK\ell)$ is a d-DNNF. As just shown,
$O(PK\ell) \vee I = O(I)$. Now $M \vee O(PK\ell) = M \vee R_{\le e+1}(I)$. Recalling that
$R_{\le e+1}(I)$ is equal to $O(I)$ without $I$ and $M \vee O(I)$ is a d-DNNF by the premise,
$M \vee O(PK\ell)$ is a d-DNNF as well. Therefore, Inv. (4) holds.

For the proof of the validity of Inv. (5), given $M \vee F \wedge O(I) = F$, the same
two cases are relevant: (A) $I \wedge N \wedge O \ne \bot$ and (B) $I \wedge N \wedge O = \bot$.

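(A) We have to show that $M \vee (I \wedge N \wedge O) \vee (F \wedge O(PK\ell)) = F$. From
$O(PK\ell) \vee I = O(I)$ we get $M \vee (F \wedge O(I)) = M \vee (F \wedge (O(PK\ell) \vee I)) =
M \vee (F \wedge O(PK\ell)) \vee (F \wedge I) = F$. But $F \wedge I = I \wedge N \wedge O$. Therefore
$M \vee (F \wedge O(I)) = M \vee (F \wedge O(PK\ell)) \vee (I \wedge N \wedge O) = M \vee (I \wedge N \wedge O) \vee (F \wedge O(PK\ell)) = F$.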

(B) We must show that $M \vee (F \wedge O(PK\ell)) = F$. Similarly to (A) we have
$M \vee (F \wedge O(I)) = M \vee (F \wedge O(PK\ell)) \vee (F \wedge I) = M \vee (F \wedge O(PK\ell)) = F$,
due to $F \wedge I = \bot$. Therefore, Inv. (5) holds after applying rule ComposeBack.
We have e’ = 0, and C’ is open, hence Inv. (6) - (8) trivially hold. 


ComposeEnd: It suffices to show that after applying rule ComposeEnd the invariants
are met by $C'$, since its subcomponent states $C_G$ and $C_H$ do not occur in
the target state anymore. Due to $I' = I$ and $\mathrm{decs}(I) = \emptyset$ and since $C'$ is closed,
Inv. (1) - (4) trivially hold.

For proving that invariant (5) holds after applying rule ComposeEnd, i.e., that
$M \vee (I \wedge N \wedge O) \vee (F \wedge O(I)) = F$, the same two cases need to be distinguished:
(A) $I \wedge N \wedge O \ne \bot$ and (B) $I \wedge N \wedge O = \bot$.

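(A) From $\mathrm{decs}(I) = \emptyset$, we get $O(I) = I$ and $F \wedge O(I) = F \wedge I$. Recalling that
$F \wedge I = I \wedge N \wedge O$, we obtain $M \vee (I \wedge N \wedge O) \vee (F \wedge O(I)) = M \vee (F \wedge O(I)) = F$
by the premise.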

(B) We have $M \vee (I \wedge N \wedge O) \vee (F \wedge O(I)) = M \vee (F \wedge O(I)) = F$ by the
premise, and Inv. (5) holds after executing rule ComposeEnd. Invariants (6) - (8)
trivially hold, due to $e' = 0$ and $I' = I$ and hence $\mathrm{decs}(I') = \emptyset$.


Corollary 1 (Soundness of ACD Run). ACD starting with an initial global 
state is sound. 


Proof. The initial state is sound by Lemma 1, and all rule applications lead to 
a sound state according to Theorem 1. 


Lemma 2 (Correctness of Closed Component State). For any closed
component $(F, V, d, 0, I, M, \delta)^c$ it holds that $M = F$.


Proof. Follows from Theorem 1, proof of Inv. (5) for rules CompTrue, CompFalse, 
and ComposeEnd, which are the only rules closing a component. 


Theorem 2 (Correctness of Final Global State). In the final global state
$S_n = \{(F, V, d, 0, I, M, \delta)^c\}$ of ACD, $M = F$ holds.


Proof. Correctness of the closed root component follows from Lemma 2. We need 
to show that the final global state contains exactly the closed root component. 
The initial global state consists of the open root component. Additional compo- 
nents are created exclusively by rule Decompose, and a parent component state 
can only be closed by rule ComposeEnd, which also removes its subcomponents 
from the global state. Hence the root component can only be closed if it has no 
subcomponents. But since the initial global state contains exclusively the root 
component, the final global state contains only the closed root component. 


Theorem 3 (Progress). ACD always makes progress. 
Unit:
$S \uplus \{(d, [l_1, \ldots, l_k, 2, 2, \ldots, 2], o)\} \succ_{ACD} S \uplus \{(d, [l_1, \ldots, l_k, 0, 2, \ldots, 2], o)\}$

Decide:
$S \uplus \{(d, [l_1, \ldots, l_k, 2, 2, \ldots, 2], o)\} \succ_{ACD} S \uplus \{(d, [l_1, \ldots, l_k, 1, 2, \ldots, 2], o)\}$

BackTrue:
$S \uplus \{(d, [l_1, \ldots, l_k, 1, l_{k+2}, \ldots, l_{|V|}], o)\} \succ_{ACD} S \uplus \{(d, [l_1, \ldots, l_k, 0, l_{k+2}, \ldots, l_{|V|}], o)\}$

BackFalse:
$S \uplus \{(d, [l_1, \ldots, l_k, 1, l_{k+2}, \ldots, l_{|V|}], o)\} \succ_{ACD} S \uplus \{(d, [l_1, \ldots, l_k, 0, l_{k+2}, \ldots, l_{|V|}], o)\}$

CompTrue:
$S \uplus \{(d, t, o)\} \succ_{ACD} S \uplus \{(d, t, c)\}$

CompFalse:
$S \uplus \{(d, t, o)\} \succ_{ACD} S \uplus \{(d, t, c)\}$

Decompose:
$S \uplus \{(d, t, o)\} \succ_{ACD} S \uplus \{(d, t, f), (d \cdot 1, [2, \ldots, 2], o), (d \cdot 2, [2, \ldots, 2], o)\}$

ComposeBack:
$S \uplus \{(d, [l_1, \ldots, l_k, 1, l_{k+2}, \ldots, l_{|V|}], f), (d \cdot 1, t_1, c), (d \cdot 2, t_2, c)\} \succ_{ACD} S \uplus \{(d, [l_1, \ldots, l_k, 0, l_{k+2}, \ldots, l_{|V|}], o)\}$

ComposeEnd:
$S \uplus \{(d, t, f), (d \cdot 1, t_1, c), (d \cdot 2, t_2, c)\} \succ_{ACD} S \uplus \{(d, t, c)\}$


Fig. 4. Rule applications lead to smaller global states. 


Proof. The proof is conducted by induction over the rules. We show that as 
long as the root component is not closed, a rule is applicable. For the case 
$S \uplus \{C\}$, where $C = (F, V, d, 0, I, M, \delta)^o$ has no subcomponents, the proof
is identical to the one showing progress in our previous work [34] replacing 
EndTrue with CompTrue and EndFalse with CompFalse, and by checking whether 
the preconditions for rule Decompose are met if rule Unit is not applicable and 
before taking a decision. Now let the global state be given by $S \uplus \{C\}$ where
$C = (F, V, d, 2, I, M, \delta)^f$ is decomposed. Due to Inv. (6), $S$ contains
$C_G = (G, \mathrm{var}(G), d \cdot 1, e_G, J_G, N, \delta_G)^s$ and $C_H = (H, \mathrm{var}(H), d \cdot 2, e_H, J_H, O, \delta_H)^s$
such that $F|_I = G \wedge H$ and $\mathrm{var}(G) \cap \mathrm{var}(H) = \emptyset$. Assume $s = c$ for both $C_G$ and
$C_H$. If $\mathrm{decs}(I) = \emptyset$, rule ComposeEnd is applicable. Otherwise, similarly to rule
BackTrue, we can show that all preconditions of rule ComposeBack are met. If
instead $s \in \{f, o\}$ for at least one of $C_G$ and $C_H$, the non-closed component(s) are
processed further, and as soon as both $C_G$ and $C_H$ are closed, rule ComposeEnd
or ComposeBack can be applied. This proves that ACD always makes progress. 


Theorem 4 (Termination). ACD always terminates. 


Proof. We need to show that no infinite sequence of rule applications can happen. 
To this end, we define a strict, well-founded ordering $\succ_{ACD}$ on the global states
and show that $S \leadsto_R T$ implies $S \succ_{ACD} T$ for all global states $S$, $T$ and rules $R$ in ACD.
Global states are sets of components, and $\succ_{ACD}$ is the multiset extension of a
component ordering $\succ_c = (\succ_{cl}, \succ_{tr}, \succ_{cs})$, where $\succ_{cl}$, $\succ_{tr}$, and $\succ_{cs}$ are orderings on
component levels, trails, and component states, respectively. We want to compare
trails defined over the same set of variables $V$, and to this end we represent them
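DecomposeG: $S \uplus \{(F, V, d, 0, I, M, \delta)^o\} \leadsto_{DecomposeG}$
$S \uplus \{(F, V, d, n, I, M, \delta)^f, ((G_i, U_i, d \cdot i, 0, \varepsilon, \bot, \infty)^o)_{i=1}^{n}\}$ if $F|_I \ne \top$ and
$\bot \notin F|_I$ and $\mathrm{units}(F|_I) = \emptyset$ and $\bigwedge_{i=1}^{n} G_i = F|_I$ and $n \ge 2$ and
$U_i = \mathrm{var}(G_i)$ and $U_i \cap U_j = \emptyset$ for $1 \le i, j \le n$ and $i \ne j$

ComposeBackG: $S \uplus \{(F, V, d, n, I, M, \delta)^f, ((G_i, U_i, d \cdot i, 0, J_i, N_i, \delta_i)^c)_{i=1}^{n}\} \leadsto_{ComposeBackG}$
$S \uplus \{(F, V, d, 0, PK\ell, M \vee I \wedge N, \delta[L \mapsto \infty][\ell \mapsto e])^o\}$ if $PQ = I$ and
$D = \neg\mathrm{decs}(I)$ and $e+1 = \delta(D) = \delta(I)$ and $\ell \in D$ and
$e = \delta(D \setminus \{\ell\}) = \delta(P)$ and $K = Q_{\le e}$ and $L = Q_{>e}$ and $N = \bigwedge_{i=1}^{n} N_i$

ComposeEndG: $S \uplus \{(F, V, d, n, I, M, \delta)^f, ((G_i, U_i, d \cdot i, 0, J_i, N_i, \delta_i)^c)_{i=1}^{n}\} \leadsto_{ComposeEndG}$
$S \uplus \{(F, V, d, 0, I, M \vee I \wedge N, \delta[I \mapsto \infty])^c\}$ if $\mathrm{decs}(I) = \emptyset$ and $N = \bigwedge_{i=1}^{n} N_i$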


Fig. 5. Generalized transition rules. 


as lists over $\{0, 1, 2\}$. A trail $I = \ell_1 \ldots \ell_m$ defined over $V$, where $m \le |V|$, is
represented as $[l_1, \ldots, l_m, 2, \ldots, 2]$, where $l_i = 0$ if $\ell_i$ is a propagation literal and
$l_i = 1$ if $\ell_i$ is a decision literal. The last $|V| - m$ positions with value 2 represent
the unassigned variables. Trails defined over the same variable set are encoded
into lists of the same length. This representation induces a lexicographic order
$>_{lex}$ on trails, and we define $\succ_{tr}$ as the restriction of $>_{lex}$ to
$\{[l_1, \ldots, l_{|V|}] \mid l_i \in \{0, 1, 2\}$ for $1 \le i \le |V|\}$, i.e., we have $t_1 \succ_{tr} t_2$ if $t_1 >_{lex} t_2$. The ordering $\succ_{tr}$ is
well-founded, its minimal element is $[0, \ldots, 0]$. The component state takes values
in $\{o, f, c\}$, and we define $\succ_{cs}$ as $>_{lex}$, i.e., $s_1 \succ_{cs} s_2$ if $s_1 >_{lex} s_2$. The minimal
element of $\succ_{cs}$ is $c$, hence $\succ_{cs}$ is well-founded. Given two component levels $d_1$ and
$d_2$, we define $d_1 \succ_{cl} d_2$ if $\mathrm{length}(d_1) < \mathrm{length}(d_2)$. This may seem counterintuitive
but is needed to ensure that the execution of rule Decompose results in a smaller
state, since both the component state and the trail of the new subcomponents
are of higher order than those of their parent. To see that $\succ_{cl}$ is well-founded,
recall that we consider finite variable sets. Their size provides an upper limit on
the length of the component level representation and a minimal element of $\succ_{cl}$.

Now we define the component ordering $\succ_c = (\succ_{cl}, \succ_{tr}, \succ_{cs})$. Let two components
be $C_1 = (d_1, t_1, s_1)$ and $C_2 = (d_2, t_2, s_2)$. We have $C_1 \succ_c C_2$ if $C_1 \ne C_2$ and
$d_1 \succ_{cl} d_2$ or $d_1 = d_2$ and either $t_1 \succ_{tr} t_2$ or $t_1 = t_2$ and $s_1 \succ_{cs} s_2$. Clearly $\succ_c$ is
well-founded, since $\succ_{cl}$, $\succ_{tr}$, and $\succ_{cs}$ are well-founded. For two global states $S$ and
$T$, we have $S \succ_{ACD} T$ if $S \ne T$ and for each component $C$ such that $C$ is larger
in $T$ than in $S$ with respect to $\succ_c$, $S$ contains a component $C'$ that is larger in $S$
than in $T$. Since $\succ_c$ is well-founded, also $\succ_{ACD}$ is well-founded. Figure 4 shows
that each rule application leads to a smaller global state, concluding our proof.
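
The ordering used in the termination argument can be made concrete with a small sketch. It is only an illustration under stated assumptions: components are abstracted to triples (level, trail, state) as in Fig. 4, the component level is modelled as a tuple of integers, and the function names `comp_greater` and `state_greater` are hypothetical, not part of ACD.

```python
from collections import Counter

def comp_greater(c1, c2):
    """Component ordering on (level, trail, state) triples, as described above."""
    (d1, t1, s1), (d2, t2, s2) = c1, c2
    if len(d1) < len(d2):   # a shorter component level means a larger component
        return True
    if d1 != d2:            # otherwise only components on the same level compare
        return False
    if t1 != t2:
        return t1 > t2      # tuples over {0, 1, 2} compare lexicographically
    return s1 > s2          # 'o' > 'f' > 'c' as strings

def state_greater(S, T):
    """Multiset extension: every component new in T is dominated by one removed from S."""
    left, right = Counter(S), Counter(T)
    if left == right:
        return False
    only_left, only_right = left - right, right - left
    return all(any(comp_greater(y, x) for y in only_left) for x in only_right)

# Decompose: the parent becomes 'f' and two fresh open subcomponents appear.
before = [((0,), (0, 2, 2), "o")]
after = [((0,), (0, 2, 2), "f"), ((0, 1), (2, 2), "o"), ((0, 2), (2,), "o")]
print(state_greater(before, after))
```

The Decompose step adds components, yet the global state still decreases, because each new component is dominated by the open parent that was replaced.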


6 Generalization 


The generalized rules are listed in Fig. 5. In our generalized framework, we have
$F|_I = \bigwedge_{i=1}^{n} G_i$, and $\mathrm{var}(G_i) \cap \mathrm{var}(G_j) = \emptyset$ for $i, j \in \{1, \ldots, n\}$ and $i \ne j$ (rule
DecomposeG). Similarly to their equivalents in ACD, rules ComposeBackG and
ComposeEndG are applicable if all subcomponents are closed.
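
As an illustration of the side condition of DecomposeG, the following sketch splits a CNF formula into subformulae over pairwise disjoint variable sets using union-find. It is not the component analysis of any particular tool; the DIMACS-style clause representation (lists of non-zero integers, no empty clauses) and the function name `components` are assumptions made for this example.

```python
def components(clauses):
    """Partition clauses into groups that share no variables."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        parent[find(a)] = find(b)

    # Variables occurring in the same clause belong to the same component.
    for clause in clauses:
        variables = [abs(lit) for lit in clause]
        for v in variables[1:]:
            union(variables[0], v)

    # Group the clauses by the representative of their first variable.
    groups = {}
    for clause in clauses:
        groups.setdefault(find(abs(clause[0])), []).append(clause)
    return list(groups.values())

parts = components([[1, -2], [2, 3], [4, 5], [-5, 6]])
print(parts)
```

For the example formula, the first two clauses share variable 2 and the last two share variable 5, so two subformulae result and $n = 2$.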
7 Discussion


We have presented ABSTRACT CNF2DDNNF, or ACD for short, a formal 
framework for compiling a formula in CNF into d-DNNF combining CDCL- 
based model enumeration with chronological backtracking [34] and dynamic 
component analysis [4]. Conflict-driven clause learning enables our framework to 
escape regions without solution early, and chronological backtracking prevents 
multiple model enumeration without the need for remembering already found 
models using blocking clauses, which slow down unit propagation. However, the 
absence of blocking clauses also prevents the use of restarts. If exclusively the 
rules Unit, Decide, BackTrue, BackFalse, CompTrue, and CompFalse are used, a 
DSOP representation of F is computed. Unit propagation is prioritized due to 
its potential to reduce the number of decisions and thus of right branches to be 
explored. Favoring decompositions over decisions may also shrink a larger part 
of the search space. Our framework lays the theoretical foundation for practical 
All-SAT and #SAT solving based on chronological CDCL. Any implementation 
which can be modeled by ACD exhibits its properties, in particular its correct- 
ness, which has been established in a formal proof. 


Comparison with Available Tools. There exist other knowledge compilers 
addressing d-DNNFs. We want to mention C2D [20], Dsharp [37], and D4 [30], 
which also execute an exhaustive search and conflict analysis. However, our app- 
roach differs conceptually from these tools in several ways. The most prominent 
ones are the use of CDCL with chronological backtracking [33,38] instead of 
CDCL with non-chronological backtracking and the way the d-DNNF is created.
Our method generates DSOP representations of formulae which cannot
be decomposed further by an exhaustive (partial) model enumeration and then
combines the result, while the tools mentioned above generate the d-DNNF by
recording the execution trace as a graph [26,27]. Like ACD, both D4 and Dsharp
adopt a dynamic decomposition strategy, while C2D constructs a decomposition
tree which it then uses for component analysis.


Future Research Directions. We plan to implement a proof of concept of 
our calculus in order to compare the size of the returned d-DNNF with the ones 
obtained by C2D, D4, and Dsharp. For dynamic component analysis, one could 
follow the algorithm implemented in COMPSAT [6], while dual reasoning [32] 
and logical entailment [35] enable the detection of short partial models. This 
is particularly interesting in tasks where the length of the d-DNNF is crucial. 
Dual reasoning has been shown to be almost competitive on CNFs if the search space
is small; we therefore expect that component analysis boosts its performance.
The major challenge posed by the second approach lies in an efficient imple- 
mentation of the oracle calls required by the entailment checks. It would be 
interesting to investigate the impact of dynamic component analysis on a recent 
implementation [46] of model enumeration by chronological CDCL [34]. Cache 
structures, being an inherent part of modern knowledge compilers and #SAT 
solvers [11,16, 19, 20,30,31,37,41,42,47,49] due to their positive impact on solver 
efficiency [1], should be added to any implementation of our framework. Finally, 
an important research topic is that of optimizing the encoding of a formula
making best use of component analysis [14]. Related to this question is whether 
formulae stemming from practical applications are decomposable in general. 


Acknowledgements. My thanks go to Armin Biere for a fruitful discussion when I 
got stuck in a first, very raw version of the proof, and to Martin Bromberger for his 
input enhancing it. 
Abstract. Sledgehammer, a component of the interactive proof assis-
tant Isabelle/HOL, aims to increase proof automation by automatically 
discharging proof goals with the help of external provers. Among these 
provers are a group of satisfiability modulo theories (SMT) solvers with 
support for the SMT-LIB input language. Despite existing formalizations 
of IEEE floating-point arithmetic in both Isabelle/HOL and SMT-LIB, 
Sledgehammer employs an abstract translation of floating-point types 
and constants, depriving the SMT solvers of the opportunity to make 
use of their dedicated decision procedures for floating-point arithmetic. 

We show that, by extending Sledgehammer’s translation from the lan- 
guage of Isabelle/HOL into SMT-LIB with an interpretation of floating- 
point types and constants, floating-point reasoning in SMT solvers can be 
made available to Isabelle/HOL. Our main contribution is a description 
and implementation of such an extension. An evaluation of the extended 
translation shows a significant increase of Sledgehammer’s success rate 
on proof goals involving floating-point arithmetic. 


1 Introduction 


Interactive theorem proving is one of the more flexible and powerful formal veri- 
fication techniques available. However, finding a proof outline with intermediate 
proof steps just simple enough for a proof assistant to be able to discharge 
automatically may require a considerable amount of time and effort, even from 
a seasoned user. As an example, the seL4 micro-kernel, the product of about 
two person-years and 9000 lines of code, took a total of about 20 person-years 
and 200,000 lines of proof development to formally verify [29]. For this reason, 
increasing proof automation in interactive proof assistants is crucial to further 
broaden their applicability. 

As a way of tackling this issue, many interactive proof assistants have the 
ability to transfer the proof burden of some of the intermediate steps onto auto- 
mated reasoning systems with automatic proof methods better suited for the 
task. This approach has proven to be quite successful in bringing the number 
of required user interactions down for many types of problems, thus increasing 
productivity. 

Among these proof assistants, we find Isabelle/HOL [34] and its powerful 
proof-delegation tool Sledgehammer [36], which acts as an interface between 
Isabelle/HOL and a number of external provers. In addition to traditional
(resolution-based) first-order automated theorem provers (ATPs) such as E [40], 
SPASS [45] and Vampire [38] and the higher-order ATP Zipperposition [9], these 
external provers include satisfiability modulo theories (SMT) solvers such as 
CVC4 [7], veriT [15] and Z3 [31]. SMT solvers are highly specialized for reason- 
ing within certain logical theories (e.g., integers, real numbers, and bit vectors), 
and often implement decision procedures more efficient than those found in the 
automatic proof methods of Isabelle/HOL. 

Whether an external prover succeeds in solving a delegated proof obligation 
depends, among other factors, on how the proof obligation is encoded in the lan- 
guage of the prover. SMT solvers support the SMT-LIB input language [6], which 
offers both uninterpreted (free) type and function symbols that are declared by 
the user, as well as theory-specific interpreted types and operations that have a 
fixed semantics. Dedicated inference rules and decision procedures for specific the- 
ories that are available in SMT solvers are typically employed only when the types 
and operations that appear in the delegated proof obligation are interpreted. An 
abstract translation that leaves types and operations uninterpreted will deprive 
external solvers of the opportunity to make use of their dedicated decision proce- 
dures for specific background theories, and will instead have to rely on a sufficient 
set of facts being passed to the solver along with the proof obligation. 

One of the more recent additions to the growing set of theories supported by 
major SMT solvers is that of floating-point arithmetic [16]. A formalization of 
IEEE floating-point arithmetic in Isabelle/HOL has been available in the Archive 
of Formal Proofs for nearly a decade [46]. However, Sledgehammer has not yet 
caught up to this development; its SMT component does not implement an 
interpretation of floating-point types and operations. Our aim is to provide such 
an interpretation, with the purpose of increasing the success rate for floating- 
point proof obligations delegated to SMT solvers, and thereby to increase the 
degree of automation in the interactive proof process. 

As an example, let us consider the commutativity of floating-point addition. 
SMT solvers that support floating-point arithmetic typically have no trouble 
proving that x + y = y + x when they can assume that x and y denote floating- 
point numbers, and that + denotes IEEE floating-point addition (i.e., when + is 
translated as fp.add). However, if this formula is translated in an uninterpreted 
fashion, the problem becomes much harder: it now requires showing commutativity
of a user-declared function over a user-declared type. Whether the SMT
solver will succeed in this case depends on many factors, including which addi- 
tional facts (definitions and lemmas) are passed along from the interactive proof 
assistant together with the proof obligation itself. 
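
To make the contrast concrete, the following sketch builds two SMT-LIB queries for the negated commutativity goal: one that uses the interpreted floating-point theory and one that uses only free symbols. This is not Sledgehammer's actual output; the function names, the symbol names `Float` and `plus`, and the choice of rounding mode RNE are assumptions made for illustration.

```python
def interpreted_query(eb=8, sb=24):
    """Negated commutativity of fp.add over an interpreted floating-point sort."""
    fp_sort = f"(_ FloatingPoint {eb} {sb})"
    return "\n".join([
        "(set-logic QF_FP)",
        f"(declare-const x {fp_sort})",
        f"(declare-const y {fp_sort})",
        "(assert (not (= (fp.add RNE x y) (fp.add RNE y x))))",
        "(check-sat)",
    ])

def uninterpreted_query():
    """The same goal with a free sort and a free binary function symbol."""
    return "\n".join([
        "(set-logic QF_UF)",
        "(declare-sort Float 0)",
        "(declare-fun plus (Float Float) Float)",
        "(declare-const x Float)",
        "(declare-const y Float)",
        "(assert (not (= (plus x y) (plus y x))))",
        "(check-sat)",
    ])

print(interpreted_query())
print(uninterpreted_query())
```

A solver with floating-point support would be expected to report the first query unsatisfiable, which proves the goal. The second query is satisfiable unless further facts about `plus` are supplied, which is why the uninterpreted translation depends on the relevance filter.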


Contributions. We define a formal model of floating-point arithmetic in Isabelle/ 
HOL that implements the SMT-LIB floating-point theory (Sect. 3). 

We then extend the SMT solver integration in Isabelle/HOL by adding 
support for floating-point arithmetic, i.e., by treating floating-point types and 
operations as interpreted in the translation from the language of Isabelle/HOL 
to the SMT-LIB input format. In addition to describing this extension in 
detail (Sect. 4), we provide an implementation (in the Archive of Formal
Proofs [46]) that supports Sledgehammer. To the best of our knowledge, this 
makes Isabelle/HOL the first interactive proof assistant to employ an inter- 
preted translation for floating-point arithmetic in its integration of automated 
theorem provers. 

An evaluation (Sect. 5), performed on a representative set of floating-point 
proof obligations from interactive proof, confirms the expectation that our trans- 
lation extension significantly increases Sledgehammer’s success rate on proof 
goals involving floating-point arithmetic, albeit at the cost of lower success rates 
for proof reconstruction—at this stage, our integration typically requires the 
external SMT solvers to be trusted as oracles. 


2 Background 


In this section, we cover additional background information regarding Sledge- 
hammer and floating-point arithmetic. 


2.1 The Sledgehammer Proof Process 


When trying to prove a conjecture in Isabelle, a user may, via a simple call to 
Sledgehammer, pass along the proof obligation to several external provers, which 
will then work on the problem in parallel. The statement to be proven is used 
by a relevance filter [30] to select additional facts (axioms and previously proven 
statements) that may help in finding a proof. All of these statements are then 
translated and compiled into a file in the input format of the external prover (in 
the case of SMT solvers, an SMT-LIB input file), as illustrated in Fig. 1. 

After working on the problem, the external prover (if it does not time out) 
returns to Isabelle with its findings. At this point, if a prover reported the con- 
jecture to be true, the user can either choose to view the prover as an oracle and 
accept the conjecture as a theorem (the dashed path in Fig. 1), or make Isabelle 
try to automatically reconstruct the proof internally, based on the additional 
facts sent with the conjecture and any proof details the prover may provide. 
Theorems that are only proved externally are marked with an oracle tag, meant 
to convey a certain amount of skepticism—reconstructed proofs are generally 
preferred, as they remove the consideration of possible bugs in the external 
prover, or in the translation between formats. 

In Sledgehammer’s translation module, types and constants are generally 
declared with a unique (freshly generated) identifier that has no inherent mean- 
ing to the external prover. A few Isabelle theories (e.g., those for integer arith- 
metic, real arithmetic, and bit vectors) define types and constants that are 
treated as interpreted by the translation into SMT-LIB [11], in which case they 
are mapped directly to their counterpart in the target logic—thereby allowing 
the SMT solvers to use their built-in decision procedures designed specifically to 
reason within the theories in question. 
[Figure: Conjecture → Relevance filter → Proof goal + facts → Translation →
Translated proof goal + facts → External prover → Proof details →
Proof reconstruction → Internal proof]

Fig. 1. A conjecture’s journey to become a theorem via Sledgehammer 


2.2 IEEE 754 Binary Floating-Point Arithmetic 


The most common way to approximate the real numbers by a suitable finite set
of numbers in modern hardware is via floating-point numbers. Simulating real
arithmetic using floating-point numbers is not a straightforward task; the definitions of
arithmetic operations are not always obvious, and should ideally not vary 
between implementations. To this end, the IEEE developed the technical stan- 
dard IEEE 754 [26], aiming to provide clear specifications and recommendations 
on all aspects of floating-point arithmetic. To meet the needs of different appli- 
cations, the standard specifies several floating-point formats, each defining a 
unique set of numbers. 

A binary floating-point format is characterized by its exponent width $w \in \mathbb{N}$,
and its precision $p \in \mathbb{N}$. A binary floating-point number, $x$, may then be
represented in this format by a triple $(s, e, f)$ of bit vectors of length 1, $w$, and
$p-1$, respectively, such that (for finite $x$)


$$x = \begin{cases} (-1)^s \cdot 2^{1-\mathrm{bias}(w)} \cdot \left(0 + \frac{f}{2^{p-1}}\right) & \text{if } e = 0 \\ (-1)^s \cdot 2^{e-\mathrm{bias}(w)} \cdot \left(1 + \frac{f}{2^{p-1}}\right) & \text{otherwise,} \end{cases} \qquad (1)$$


where $\mathrm{bias}(w) = 2^{w-1} - 1$. The standard also specifies two signed infinities, $+\infty$
and $-\infty$, denoting values that are too great in magnitude for the format. These
are represented by the triples (0,1...1,0...0) and (1,1...1,0...0), respectively.
Together, the sign s, the (biased) exponent e, and the fraction f
constitute a unique representation of any finite or infinite floating-point number;
in particular, the two numbers +0, represented by (0,0...0,0...0), and —0, 
represented by (1,0...0,0...0), are considered distinct. To represent the result 
of invalid operations, such as 0/0, the standard defines a special Not-a-Number
(NaN) value, represented via any triple (s,1...1, f) such that f ≠ 0...0.¹
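
The decoding of a triple can be sketched in a few lines. The sketch assumes that the bit vectors e and f are given as unsigned integers; the function names `bias` and `decode` are chosen for this example only. Exact rationals are used for finite values so that no additional rounding is introduced.

```python
import math
from fractions import Fraction

def bias(w):
    """Exponent bias of a format with exponent width w."""
    return 2 ** (w - 1) - 1

def decode(s, e, f, w, p):
    """Value of the triple (s, e, f) in the format with exponent width w and precision p."""
    if e == 2 ** w - 1:                      # exponent bits all ones
        if f == 0:
            return -math.inf if s else math.inf
        return math.nan                      # any non-zero fraction denotes NaN
    sign = -1 if s else 1
    fraction = Fraction(f, 2 ** (p - 1))
    if e == 0:                               # zero or subnormal number
        return sign * Fraction(2) ** (1 - bias(w)) * fraction
    return sign * Fraction(2) ** (e - bias(w)) * (1 + fraction)

# Single precision (w = 8, p = 24): exponent 127 with zero fraction.
print(decode(0, 127, 0, 8, 24))
```

Because the result is a rational number, the sketch maps both +0 and −0 to the same value 0; the distinction between the two zeros exists only at the level of the triples.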

Additionally, IEEE 754 specifies various arithmetic operations on floating- 
point numbers. Conceptually, floating-point arithmetic is carried out by convert- 
ing floating-point numbers to more precise values, performing the corresponding 
arithmetic operation, and converting the result back to the original floating-point 
format, in an emulation of a rounded infinitely precise calculation. In an envi- 
ronment like Isabelle/HOL, where theories of real arithmetic are available, the 
task of carrying out calculations with infinite precision falls upon these, whereas 
the floating-point operations handle the rounding and special cases (e.g., an 
argument being NaN or infinite). IEEE 754 specifies precisely how this handling 
should be performed. 


3 An Implementation of SMT-LIB Floating-Point 
Arithmetic in Isabelle/HOL 


Formalizations of floating-point arithmetic are readily available for many proof 
assistants. For Isabelle/HOL, a formalization originally developed by Lei Yu is 
available from the Archive of Formal Proofs [46]. This defines a (polymorphic) 
type of floating-point numbers, whose instances correspond to IEEE floating- 
point formats with specific width and precision, and various arithmetic opera- 
tions over this type. 

However, although both are based on the IEEE standard, there are impor- 
tant semantic differences between this model and the SMT-LIB floating-point 
theory [16]. These differences would have rendered a direct interpretation of Lei 
Yu’s model in the SMT-LIB floating-point theory unsound. 

First, the SMT-LIB theory offers five rounding modes. The mode round- 
NearestTiesToAway (which is optional according to IEEE 754) was not available 
in the Isabelle/HOL model. Therefore, the enumerated type of rounding modes 
in Isabelle/HOL did not correspond to the RoundingMode sort in SMT-LIB. We 
have resolved this difference by adding support for roundNearestTiesToAway 
to Lei Yu’s model. Although rounding is pervasive in IEEE—it is performed by 
most arithmetic operations—it is factored out into only two functions in the 
Isabelle/HOL model (round and intround), so that this was a relatively minor, 
local change. 
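
The difference between the two round-to-nearest modes only shows up for ties. The following sketch illustrates the tie-breaking rule on decimal values rounded to integers; it is not the Isabelle/HOL round function, and it uses Python's decimal module, whose `ROUND_HALF_UP` mode rounds ties away from zero, as a stand-in for roundNearestTiesToAway.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_to_integer(value, mode):
    """Round a decimal string to the nearest integer under the given tie-breaking mode."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=mode))

for value in ["2.5", "-2.5", "3.5", "2.4"]:
    ties_to_even = round_to_integer(value, ROUND_HALF_EVEN)
    ties_to_away = round_to_integer(value, ROUND_HALF_UP)
    print(value, ties_to_even, ties_to_away)
```

The two modes agree whenever the value is not exactly halfway between two candidates, which is why a change confined to the rounding functions suffices.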

Second, the formalization by Lei Yu emphasizes the bit representation of 
floating-point values (corresponding to specification level 4 in IEEE 754), while 
the SMT-LIB floating-point theory takes a more abstract view (corresponding 
to specification level 2 in IEEE 754). Specifically, in Lei Yu’s formalization, 
each floating-point format contains multiple NaN values (with different bit rep- 
resentations), while the corresponding floating-point format in SMT-LIB only 


1 The IEEE 754 standard defines a quiet and a signalling NaN. This distinction is not 
present in the SMT-LIB floating-point theory, which is based on a higher level of 
abstraction. 
contains a single (abstract) NaN value. To resolve this fundamental difference,
we have constructed a new model of floating-point arithmetic in Isabelle/HOL. 
Our starting point is a quotient construction over the type (’e,’f) float of 
floating-point numbers offered by Lei Yu’s model. We first define an equivalence 
relation is_nan_equivalent on this type that relates all NaN values: 
definition is_nan_equivalent :: ('e,'f) float ⇒ ('e,'f) float ⇒ bool
  where is_nan_equivalent a b ≡ a = b ∨ (is_nan a ∧ is_nan b)

We then define a new type ('e,'f) floatSingleNaN that contains the equivalence
classes of ('e,'f) float with respect to the relation is_nan_equivalent:
quotient_type (overloaded) ('e,'f) floatSingleNaN =
  ('e,'f) float / is_nan_equivalent

The resulting type ('e,'f) floatSingleNaN contains a single (abstract) NaN
value. The (type) arguments ’e and ’f indicate the bit width of the exponent 
and fraction, respectively. A similar construction, but limited to the double- 
precision (64-bit) format, was used in [8] to facilitate OCaml code generation 
for floating-point numbers. Flocq [14], a Coq library of floating-point arithmetic, 
defines a type with similar semantics inductively, rather than using a quotient 
construction. 
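
The effect of the quotient construction can be checked exhaustively on a toy format. The sketch below is an illustration in Python rather than Isabelle; the format parameters (exponent width 2, precision 3) and the helper names are assumptions. It enumerates all triples, groups them by the relation above, and counts the resulting classes.

```python
from itertools import product

W, P = 2, 3  # toy format: 2 exponent bits, 2 fraction bits

def is_nan(x):
    """A triple denotes NaN if the exponent is all ones and the fraction is non-zero."""
    s, e, f = x
    return e == 2 ** W - 1 and f != 0

def is_nan_equivalent(a, b):
    return a == b or (is_nan(a) and is_nan(b))

def quotient(values):
    """Group values into equivalence classes of is_nan_equivalent."""
    classes = []
    for v in values:
        for cls in classes:
            if is_nan_equivalent(cls[0], v):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes

values = list(product(range(2), range(2 ** W), range(2 ** (P - 1))))
classes = quotient(values)
print(len(values), len(classes))
```

In this toy format, 6 of the 32 triples denote NaN. They collapse into one class, while every other value remains in a class of its own.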

Most floating-point operations can then be lifted [25] in a straightforward 
manner from (’e,’f) float to (’e,’f) floatSingleNaN. We have addition- 
ally defined various operations that are supported in SMT-LIB but that were not 
available in Lei Yu’s model, such as conversion functions between floating-point 
numbers and bit vectors. Our model now covers all operations that are available 
in the SMT-LIB floating-point theory. 

Some (rather subtle) semantic differences between our model and the SMT- 
LIB floating-point theory remain. In SMT-LIB, the result of certain opera- 
tions, such as converting NaN or infinities to a real number, is unspecified. 
Isabelle/HOL does not support partial specifications; therefore, the result of 
these operations is defined² in our model. Technically, the Isabelle/HOL model
is an implementation of the SMT-LIB specification. This does not affect the 
soundness of interpreting the model in SMT-LIB: any theorem provable under 
SMT-LIB semantics also holds for the Isabelle/HOL model. 

An error in the remainder function float_rem as defined in Isabelle/HOL
was discovered during implementation and has been patched: the remainder of
a finite floating-point value x and ±∞ shall be x [26, §5.3.1].
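The IEEE 754 rule for the remainder of a finite value and an infinity can be observed with Python's math.remainder (available since Python 3.7), which follows the IEEE 754-style remainder. This is merely an independent illustration of the rule, not the Isabelle/HOL definition of float_rem.

```python
import math

# Finite sample values; each is paired with +infinity and -infinity.
samples = [5.5, -3.0, 0.0, 1e308]

# For finite x, the IEEE 754 remainder of x and an infinity is x itself.
results = [
    (x, math.remainder(x, sign * math.inf))
    for x in samples
    for sign in (1.0, -1.0)
]

for x, r in results:
    print(f"remainder({x}, inf or -inf) = {r}")
```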


4 Interpreting Isabelle/HOL Floating-Point Arithmetic 
in SMT-LIB 


This section describes an interpreted translation of floating-point types and oper- 
ations from Isabelle/HOL to SMT-LIB. Our translation extends a preexisting 
general translation [11] targeting SMT solvers that is part of Sledgehammer, 
which treats floating-point arithmetic as uninterpreted. It supports the formal
model of IEEE floating-point arithmetic in Isabelle/HOL that was described in
the previous section. We aim to be comprehensive but restrict attention to those
floating-point concepts that are defined in both Isabelle/HOL and SMT-LIB.

² For instance, in terms of a special constant called undefined.


4.1 SMT-LIB Logic 


The first task of our translation module is to select an SMT-LIB logic within 
which the SMT solver is to reason when deciding the satisfiability of the formula. 
For performance reasons, it is generally a good idea to select a logic that is as 
specific as allowed by the contents and structure of the formula. However, FP, the 
logic for floating-point arithmetic, is too restrictive for many of Isabelle’s proof 
obligations, which may freely combine floating-point operations with other types 
and constants. When translated, these will require support for symbols that are 
either free (uninterpreted) or defined in other SMT-LIB theories. 

Sledgehammer’s SMT integration relies on callback functions to analyze the 
proof obligation and determine the problem’s logic. However, only one of these 
functions may select a logic. In the absence of a framework allowing for a more 
modular approach (e.g., incrementally generalizing the logic as little as necessary, 
based on the types and constants that appear in the proof obligation), we need 
to select a logic that covers all operations that appear in the proof obligation. To 
achieve this, whenever a supported floating-point type is detected in the formula 
to be translated, our callback function returns the (pseudo-)logic ALL. Available 
since version 2.5 of the SMT-LIB standard, this provides a convenient way to 
select the most general logic that the respective SMT solver supports. 
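The logic selection rule described above can be sketched as follows. This is a simplified illustration rather than the actual Sledgehammer callback: the string representation of types, the regular expression, and the convention that None means "no logic selected by this function" are all assumptions made for the example.

```python
import re
from typing import Iterable, Optional

# Hypothetical textual form of a fixed-size floating-point type, e.g. "(8,23) floatSingleNaN".
FLOAT_TYPE = re.compile(r"^\((\d+),\s*(\d+)\)\s*floatSingleNaN$")


def is_supported_float_type(type_name: str) -> bool:
    """A fixed-size format is supported when m > 1 and n > 0."""
    match = FLOAT_TYPE.match(type_name)
    if match is None:
        return False  # polymorphic or unrelated type
    m, n = int(match.group(1)), int(match.group(2))
    return m > 1 and n > 0


def select_logic(type_names: Iterable[str]) -> Optional[str]:
    """Return the pseudo-logic ALL as soon as a supported float type occurs."""
    if any(is_supported_float_type(t) for t in type_names):
        return "ALL"
    return None  # assumption: this callback then selects no logic


print(select_logic(["real", "(11,52) floatSingleNaN"]))
```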


4.2 Types 


Both Isabelle/HOL and SMT-LIB define binary floating-point formats of arbitrary
width of the exponent and fraction fields. In Isabelle/HOL, (m,n) floatSingleNaN
is the type of floating-point numbers with an exponent field of width m
and a fraction field of width n (and thus with precision n+1). In SMT-LIB, 
the hidden bit of the significand (the bit preceding the fraction) is included in 
the format specification, making (_ FloatingPoint m n+1) the corresponding 
sort. The SMT-LIB sorts are only defined for formats with m > 1 and n > 0, 
whereas m and n are merely required to be positive in Isabelle/HOL. Thus, any 
type (1,n) floatSingleNaN lacks a corresponding sort in SMT-LIB, and is left 
uninterpreted by the translation. 
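The correspondence between types and sorts amounts to a shift of the second index by one. A minimal sketch, assuming a hypothetical helper floating_point_sort that returns None for formats without an SMT-LIB counterpart:

```python
from typing import Optional


def floating_point_sort(m: int, n: int) -> Optional[str]:
    """Map (m,n) floatSingleNaN to its SMT-LIB sort, if one exists.

    m is the exponent width and n the fraction width. SMT-LIB counts the
    hidden bit, so the significand width is n + 1.
    """
    if m < 1 or n < 1:
        raise ValueError("Isabelle/HOL requires positive m and n")
    if m == 1:
        return None  # no corresponding sort; left uninterpreted
    return f"(_ FloatingPoint {m} {n + 1})"


print(floating_point_sort(8, 23))   # single precision
print(floating_point_sort(11, 52))  # double precision
```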

In Isabelle/HOL, all floating-point formats (m,n) floatSingleNaN are 
instances of a polymorphic type (’e,’f) floatSingleNaN. Here, ’e and ’f 
are type variables that may be instantiated with concrete (type) arguments, or 
left uninstantiated to express generic properties that hold for all floating-point 
formats. Due to the current lack of support for polymorphism in SMT-LIB, 
(m,n) floatSingleNaN is interpreted only when m and n are (type) arguments 
encoding fixed numeric values; polymorphic types are left uninterpreted. 

In addition to the types for floating-point formats, Isabelle/HOL defines an 
enumerated type roundmode for the rounding modes used by the arithmetic 
operations. SMT-LIB provides a corresponding type; roundmode is interpreted
as RoundingMode in SMT-LIB. 


4.3 Constants 


For the sake of brevity, we focus here on some of the more interesting aspects 
of the translation of constants. (In HOL, constants are not limited to arity 0, 
but may have a function type.) An exhaustive enumeration of the mapping is 
provided in Table 1. 


Polymorphism. The issue regarding polymorphism, described in the previous 
section, affects the translation of constants as well. A constant can only be inter- 
preted if its type is not polymorphic. Since Isabelle’s automatic type inference 
assigns constants the most general type possible with respect to the context, 
variables and constants with a floating-point type will in many cases need to be
annotated with explicit type constraints in order to trigger the interpretation.


Direct Correspondence. For many floating-point related constants in Isabelle, 
there is a direct semantic-preserving mapping to a function in SMT-LIB. Among 
these we find, e.g., the rounding modes and comparison operations together 
with many arithmetic operations and classification predicates. The translation 
of these does not involve much more than simply replacing their name with the 
corresponding identifier in SMT-LIB. 


Format Parameter Extraction. A few SMT-LIB functions targeted by our trans- 
lation are technically elements of an infinite family of functions generated by 
an index over all floating-point formats. This holds, e.g., for the conversion 
operation from reals to floating-points, and for the (nullary) functions denoting
the special floating-point values ±0, ±∞ and NaN. Their behavior depends
on the result sort, which is not necessarily derivable from context and must
be indicated explicitly in SMT-LIB. In these cases, we extract the type arguments
of the (result) type of the constant to be translated, and add them
explicitly as arguments to the corresponding function symbol in SMT-LIB.
For instance, the Isabelle/HOL function round of type roundmode ⇒ real ⇒
(’e,’f) floatSingleNaN, which converts a real number into a floating-point
number (rounding as necessary), is interpreted as (_ to_fp m n+1) whenever its 
result type is of the form (m,n) floatSingleNaN, where m and n encode fixed 
numeric values. 
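Format parameter extraction can be sketched as a lookup followed by index construction. The dictionary below is a hand-picked excerpt of the mapping in Table 1, and translate_constant is a hypothetical helper introduced only for this illustration; the actual translation operates on Isabelle terms rather than strings.

```python
from typing import Optional, Tuple

# Excerpt: Isabelle/HOL constants whose SMT-LIB counterpart is indexed
# by the result format.
INDEXED = {
    "plus_infinity": "+oo",
    "minus_infinity": "-oo",
    "minus_zero": "-zero",
    "NaN": "NaN",
    "round": "to_fp",
}


def translate_constant(name: str, result_format: Optional[Tuple[int, int]]) -> Optional[str]:
    """Return the indexed SMT-LIB symbol, or None if left uninterpreted.

    result_format is (m, n) when the result type is (m,n) floatSingleNaN
    with fixed numeric arguments, and None when the type is polymorphic.
    """
    if name not in INDEXED or result_format is None:
        return None
    m, n = result_format
    if m <= 1 or n <= 0:
        return None  # no SMT-LIB sort for this format
    return f"(_ {INDEXED[name]} {m} {n + 1})"


print(translate_constant("round", (8, 23)))
```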


Term Translation. Isabelle/HOL supports the definition of advanced concepts on 
top of the types and constants that are provided by the model of floating-point 
arithmetic. Our translation does not interpret such derived concepts directly. 
Instead, these can be handled by unfolding their definitions in Isabelle when 
desired, or by relying on Sledgehammer’s relevance filter, which can make their 
definitions and other relevant facts available to external provers automatically. 
Table 1. Types and constants in Isabelle/HOL covered by the translation, together
with sorts and functions in SMT-LIB. m > 1 and n > 0 indicate the floating-point
format. Square brackets denote syntactic sugar, which is also interpreted.


                      ISABELLE/HOL                SMT-LIB
Floating-point type   (m,n) floatSingleNaN        (_ FloatingPoint m n+1)
Rounding mode type    roundmode                   RoundingMode
Bit-vector type       m word                      (_ BitVec m)
Rounding mode         roundNearestTiesToEven      RNE
Rounding mode         roundNearestTiesToAway      RNA
Rounding mode         roundTowardPositive         RTP
Rounding mode         roundTowardNegative         RTN
Rounding mode         roundTowardZero             RTZ
Value construction    fp                          fp
Positive infinity     plus_infinity [∞]           (_ +oo m n+1)
Negative infinity     minus_infinity              (_ -oo m n+1)
Positive zero         zero_class.zero [0]         (_ +zero m n+1)
Negative zero         minus_zero                  (_ -zero m n+1)
Not-a-number          NaN                         (_ NaN m n+1)
Absolute value        abs_class.abs [|·|]         fp.abs
Negation              uminus_class.uminus [-]     fp.neg
Addition              add                         fp.add
Subtraction           sub                         fp.sub
Multiplication        mul                         fp.mul
Division              div                         fp.div
Fused multiply-add    mul_add                     fp.fma
Square root           sqrt                        fp.sqrt
Remainder             float_rem                   fp.rem
Integral rounding     intrnd                      fp.roundToIntegral
Less or equal         le                          fp.leq
Less than             lt                          fp.lt
Greater or equal      ge                          fp.geq
Greater than          gt                          fp.gt
IEEE equality         eq                          fp.eq
Normal?               is_normal                   fp.isNormal
Subnormal?            is_subnormal                fp.isSubnormal
Zero?                 is_zero                     fp.isZero
Infinity?             is_infinity                 fp.isInfinite
NaN?                  is_nan                      fp.isNaN
Negative?             is_negative                 fp.isNegative
Positive?             is_positive                 fp.isPositive
To real               valof                       fp.to_real
To unsigned word      unsigned_word_of_float      fp.to_ubv
To signed word        signed_word_of_float        fp.to_sbv
From IEEE word        float_of_IEEE754_word       (_ to_fp m n+1)
From real             round                       (_ to_fp m n+1)
From float            float_of_float              (_ to_fp m n+1)
From signed word      float_of_signed_word        (_ to_fp m n+1)
From unsigned word    float_of_unsigned_word      (_ to_fp_unsigned m n+1)


226 O. Torstensson and T. Weber 


5 Evaluation 


To investigate the difference in the performance of Sledgehammer brought on 
by the interpreted translation, and to get a clear overview of the comparative 
performance of the SMT solvers, we conducted an experimental evaluation on 
a set of proof obligations that involve floating-point operations. Freely available 
Isabelle formalizations of floating-point properties are scarce; only a few proper- 
ties are included with the formal IEEE model in the Archive of Formal Proofs. 
We complemented these with our own formalizations of floating-point properties 
taken from the IEEE 754 standard and the Handbook of Floating-point Arith- 
metic [32], resulting in a set of 124 formulas. The formulas in the evaluation set 
exhibit difficulties ranging from nearly trivial to levels on par with Sterbenz’s 
lemma [42]. 

All formulas in the evaluation set are polymorphic over a single floating- 
point type (’e,’f) floatSingleNaN. This type was instantiated to different 
fixed-size floating-point formats: half (16-bit), single (32-bit), double (64-bit), 
and quadruple (128-bit) precision formats, as specified by IEEE 754. The inter- 
preted translation was evaluated on each of these fixed-size formats. For com- 
parison, the abstract (uninterpreted) translation that was previously employed 
by Sledgehammer was additionally evaluated on the original (polymorphic) eval- 
uation set. This gives rise to nine different models—technically, Isabelle theories 
with different type annotations—for measuring Sledgehammer’s performance on 
the evaluation set, defined for x ∈ { (5,10), (8,23), (11,52), (15,112) } as:

- I_x: interpretation is enabled and all floating-points are of type
  x floatSingleNaN.

- U_x: interpretation is disabled and all floating-points are of type
  x floatSingleNaN.

- U_poly: interpretation is disabled and all floating-points are of polymorphic
  type (’e,’f) floatSingleNaN.
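As a quick sanity check on the count of models, the following sketch enumerates them from the four formats. The total bit length of a format (m,n) is taken to be 1 + m + n (sign, exponent, fraction); the name strings are an ASCII rendering chosen for this illustration.

```python
# (exponent width, fraction width) for half, single, double, and quadruple precision.
FORMATS = [(5, 10), (8, 23), (11, 52), (15, 112)]


def total_bits(m: int, n: int) -> int:
    """One sign bit plus the exponent and fraction fields."""
    return 1 + m + n


def model_names(formats):
    """List the interpreted and uninterpreted model for each format, plus the polymorphic one."""
    names = []
    for m, n in formats:
        width = total_bits(m, n)
        names += [f"I_{width}", f"U_{width}"]
    names.append("U_poly")
    return names


widths = [total_bits(m, n) for m, n in FORMATS]
models = model_names(FORMATS)
print(widths, len(models))
```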


We used the Mirabelle [17] tool with default settings—including a 30s time 
limit per formula—to apply Sledgehammer to each proof obligation. The default 
external provers invoked by Sledgehammer in Isabelle2022 are the ATPs E (ver- 
sion 2.6-1), SPASS (version 3.8ds-2), Vampire (version 4.6) and Zipperposition 
(version 2.1-1), along with the SMT solvers CVC4 (version 1.8), veriT (ver- 
sion 2021.06.2-rmx), and Z3 (version 4.4.0_4.4.1). Since the floating-point solver 
in this version of Z3 suffers from a soundness bug, we evaluated Z3 version 4.12.2 
instead. We did not evaluate newer versions of the other solvers, such as cvc5 [3], 
as they are not yet integrated with Isabelle. 

Out of the three SMT solvers, only CVC4 and Z3 support the floating- 
point theory of SMT-LIB. For each of the nine models, we evaluated four dif- 
ferent prover configurations: CVC4 only, Z3 only, CVC4+Z3, and Sledgeham- 
mer’s default prover configuration, which includes all of the ATPs and SMT 
solvers listed above. For the I_x models, where interpretation is enabled, the
default prover configuration uses both interpreted and uninterpreted translations
(depending on the prover). For CVC4, we enabled its experimental floating-point
solver (option --fp-exp) to obtain support for floating-point formats beyond 
single and double precision. 

Sledgehammer’s relevance filter had access to a large collection of theorems 
from the Isabelle/HOL library, including the definitions of all types and oper- 
ations, and (for later formulas in the evaluation set) to all formulas that were 
evaluated earlier. This mimics realistic use in interactive proof, where users can 
rely on proven statements and employ them as lemmas in subsequent proofs. To 
avoid later runs being affected by earlier runs, the status of the machine learning 
selection of facts (stored in the Isabelle configuration file mash_state) was reset 
before each Mirabelle run. 

The experiments were conducted under Debian GNU/Linux 6.1.0-10-amd64, 
running on an i9-9980HK CPU at 2.4 GHz with 16 processor threads and 32 GB 
of main memory. 


5.1 Results 


Table 2 shows Sledgehammer’s success rates for the four different prover configurations
when run on the evaluation set in the models described above. For
convenience, the four fixed formats are abbreviated by their total bit length (16, 
32, 64, and 128, respectively) in the model name. Sledgehammer succeeds when 
at least one of the external provers reports that it found a proof within the time 
limit of 30s. 


Table 2. Sledgehammer’s success rates for the four prover configurations on proof 
goals from the evaluation set, by model. 


                U_16   I_16   U_32   I_32   U_64   I_64   U_128  I_128  U_poly
CVC4            41%    94%    57%    91%    35%    90%    58%    89%    54%
Z3              39%    86%    56%    85%    35%    84%    56%    77%    58%
CVC4+Z3         41%    95%    58%    91%    36%    90%    58%    89%    57%
Default (all)   41%    94%    60%    91%    37%    91%    60%    88%    56%


In this case, Sledgehammer attempts to reconstruct the external proof in 
Isabelle using a collection of automated proof methods (as discussed in Sect. 2.1). 
The success rates for this process, again as a percentage of the total number (124) 
of proof obligations, are shown in Table 3. 

For each floating-point format (and also for the polymorphic model), the 
largest success rate across prover configurations, with or without interpretation 
enabled, is indicated in boldface. 
Table 3. Success rates of proof reconstruction for the four prover configurations on
proof goals from the evaluation set, by model.


                U_16   I_16   U_32   I_32   U_64   I_64   U_128  I_128  U_poly
CVC4            41%    5%     55%    5%     35%    5%     54%    5%     54%
Z3              39%    4%     54%    4%     35%    4%     53%    4%     58%
CVC4+Z3         41%    5%     55%    5%     36%    5%     56%    5%     57%
Default (all)   40%    7%     58%    7%     37%    7%     57%    7%     54%


5.2 Discussion 


Based on the results of our evaluation, we put forward the following observations: 


1. An interpreted translation increases Sledgehammer’s success rate for all
prover configurations and fixed-size floating-point formats. With an uninterpreted
translation, success rates vary between 35% and 58%. This increases to
between 77% and 95% with an interpreted translation. Across the board, the
interpreted translation performs significantly better than the uninterpreted
translation.

2. The increase in Sledgehammer’s success rate is most pronounced for the half
(16-bit) and double (64-bit) precision formats. The uninterpreted translation
performs worse for these two formats (with success rates of 35% to 41%)
than for single and quadruple precision. In contrast, the interpreted translation
consistently yields high success rates (of 89% to 95% in the best solver
configuration) regardless of the format’s precision.

3. Sledgehammer’s success rate on the polymorphic model is generally comparable
to, and in some cases better than, its success rate for fixed-size formats with
an uninterpreted translation. When the external provers cannot take advantage
of their decision procedures for fixed-size floating-point arithmetic, reasoning
about fixed-size properties is no easier for them than reasoning about
polymorphic properties. (Indeed, depending on the additional facts chosen by
Sledgehammer’s relevance filter, it may well be harder.) This changes when
interpretation is enabled.

4. CVC4 outperforms Z3 on most models. This is true both with and without
interpretation enabled. The only exception is the polymorphic model, where
Z3 performs slightly better than CVC4. Using all available provers typically
results in (only) slightly higher success rates than using CVC4 alone, but can
also lead to slightly lower success rates (mainly because of non-determinism
in Sledgehammer’s behavior).

5. With interpretation disabled, proof reconstruction success rates are often close
to Sledgehammer’s success rates. In other words, proof reconstruction in the
uninterpreted models succeeds on the vast majority of proofs found by external
provers. This is a testament to the power of Isabelle’s built-in proof methods
(in particular, metis), which provide strong automation for first-order
reasoning.


Hammering Floating-Point Arithmetic 229 


6. Interpretation leads to (much) lower proof reconstruction rates for all prover
configurations and fixed-size floating-point formats. Although interpretation
allows external provers to find more proofs, these proofs are rarely successfully 
reconstructed in Isabelle. This is to be expected: Isabelle currently does not 
offer built-in automated proof procedures for floating-point reasoning that 
could be used to reconstruct such proofs. 
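The ranges quoted in the first observation can be recomputed from Table 2. The sketch below transcribes the fixed-size columns of that table; the grouping into "SMT-only" configurations is our own reading, chosen because the quoted 35% to 58% range matches the three SMT-only rows, whereas the default configuration reaches 60% on two uninterpreted models.

```python
# Success rates (%) transcribed from Table 2, fixed-size models only.
# Column order: U_16, I_16, U_32, I_32, U_64, I_64, U_128, I_128.
TABLE2 = {
    "CVC4": [41, 94, 57, 91, 35, 90, 58, 89],
    "Z3": [39, 86, 56, 85, 35, 84, 56, 77],
    "CVC4+Z3": [41, 95, 58, 91, 36, 90, 58, 89],
    "Default (all)": [41, 94, 60, 91, 37, 91, 60, 88],
}

SMT_ONLY = ["CVC4", "Z3", "CVC4+Z3"]


def rate_range(configs, interpreted):
    """Return (min, max) success rate over the given configurations."""
    start = 1 if interpreted else 0  # interpreted models sit in the odd columns
    values = [v for c in configs for v in TABLE2[c][start::2]]
    return min(values), max(values)


def always_improves():
    """Check that every interpreted model beats its uninterpreted counterpart."""
    return all(
        row[i + 1] > row[i] for row in TABLE2.values() for i in range(0, 8, 2)
    )


print(rate_range(SMT_ONLY, interpreted=False))
print(rate_range(SMT_ONLY, interpreted=True))
print(rate_range(list(TABLE2), interpreted=False))
```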


Many formulas from the evaluation set were previously proven with 10-20 lines of 
interactively developed Isabelle proof script, and can now (after interpretation) 
be proven completely automatically by CVC4 or Z3. The interpreted translation 
can save significant amounts of human labor in formal proof developments that 
involve floating-point arithmetic. However, due to the lower proof reconstruction 
rate, interpretation of floating-point arithmetic is currently primarily of interest 
to users who are willing to accept CVC4 and Z3 as oracles (cf. Sect. 2.1). 


6 Related Work 


The practice of employing automatic provers as back-ends in interactive theorem 
provers is not unique to Isabelle. Generic proof-delegation tools similar to Sledge- 
hammer have also been developed for other proof assistants, e.g., MizAR [43] 
for Mizar [2], and HOL(y)Hammer [27] for HOL Light [22] and HOL4 [41]. 
There are also proof-delegation tools aimed specifically toward SMT solvers, 
e.g., Smtlink [37] for ACL2 [28] and SMTCoq [1] for Coq [10].

Single integrations of SMT solvers have perhaps been more common than 
these larger-scale tools. The interactive theorem prover PVS [35] is tightly con- 
nected with the SMT solver Yices [18] (and its predecessor ICS), which has been 
available as a decision procedure for a long time. An oracle integration of Yices in 
Isabelle by Erkök and Matthews [20] makes use of its dedicated decision proce- 
dures, but refrains from translating into SMT-LIB, and instead targets the native 
input format of Yices due to its expressiveness. Weber [44] proposes a similar 
oracle integration of Yices into HOL4, but extends it with support for additional 
SMT solvers via the SMT-LIB format. This integration has since been supple- 
mented with proof reconstruction and become part of HOL(y)Hammer [13]. 

The work presented here is based on the original integration of SMT solvers 
in Isabelle’s Sledgehammer by Blanchette et al. [11]. It is dependent on vari- 
ous aspects of their translation into SMT-LIB, including the interpretation of 
bit-vector types and constants. In this sense, it also bears resemblance to how 
SMTCoq was recently extended with dedicated support for the theory of bit
vectors [19]. 

Formalizations of IEEE 754 floating-point arithmetic are readily available in 
interactive proof assistants, e.g., in HOL Light [23], ACL2 [39], and Coq [14], and 
have been used extensively to verify floating-point related properties. However, 
to the best of our knowledge, no integration of SMT solvers in interactive proof 
assistants takes advantage of the dedicated decision procedures for floating-point 
arithmetic available in the former. 
Superficially, the work perhaps most similar to ours is a Why3 [12] formalization
of floating-point arithmetic and its mapping to the SMT-LIB floating-point
theory [21]. Why3, however, is not a prover itself, but a stand-alone
proof-delegation tool relying completely on external provers. Thus, greater
automation in interactive proof assistants is not a shared objective.


7 Conclusions 


In the years since its introduction in Isabelle, Sledgehammer has seen a number 
of improvements. To varying degrees, they have each gradually brought us closer to
the ultimate goal of powerful proof automation in interactive proof assistants. 
By defining a formal model of floating-point arithmetic in Isabelle/HOL that 
implements SMT-LIB semantics, and by enhancing the translation from Isabelle 
to SMT-LIB with an interpretation of floating-point types and constants, we 
have taken another step in this direction. Sledgehammer enjoys a significant 
increase in success rates (before proof reconstruction) for proof obligations that 
involve floating-point arithmetic. 

Many proof obligations that were previously out of reach for any automated 
prover can now be solved automatically. For users who are willing to trust the 
external SMT solvers, enhancing Sledgehammer’s translation with a floating- 
point interpretation increases proof automation and reduces the manual effort 
required to construct proofs in this important application domain. 

Our translation does not require formulas to be fully interpretable in the 
SMT-LIB floating-point theory. The SMT solvers are instructed to reason in a 
more general logic, where interpreted and uninterpreted sorts and functions can 
be combined freely. 

There are two notable limitations, which we propose to address through 
future work. First, the interpretation of floating-point arithmetic is restricted 
to fixed-size formats. In many situations, this is not a severe limitation—fixed- 
size reasoning is sufficient, for instance, when one wants to verify a specific 
hardware architecture, or a software implementation that uses a specific floating- 
point type such as binary64. However, floating-point properties that hold for 
all formats are most naturally stated polymorphically in Isabelle/HOL. Such 
properties cannot be interpreted in the floating-point theory of SMT-LIB, which 
(in its current version 2.6) lacks support for polymorphism: although it offers a 
type (_ FloatingPoint m n) for any sufficiently large m and n, it does not offer 
a polymorphic type (_ FloatingPoint m n) where m and n are variables that 
may be instantiated. 

Supporting polymorphism in SMT solvers is no small feat. Fortunately, there 
is ongoing work to obtain a tighter integration of automatic provers, including 
SMT solvers, with proof assistants. One of the means by which to achieve this is 
via support for higher-order logic in these provers [5]. Most likely, SMT-LIB 3— 
the next major update to SMT-LIB—will facilitate these changes by supporting 
polymorphism [4]. When such support becomes available in SMT solvers that 
support floating-point arithmetic, an interpreted translation can be employed 
also for polymorphic floating-point properties. There has already been work on
supporting parametric bit-vector formulas in SMT solvers by encoding them as 
formulas over non-linear integer arithmetic, uninterpreted functions, and uni- 
versal quantifiers (the UFNIA logic in SMT-LIB) [33]. This approach could in 
principle be extended to floating-point numbers. 

Second, interpretation of floating-point arithmetic allows SMT solvers to find 
more proofs, but reduces proof reconstruction rates in Isabelle. There is a mis- 
match between the reasoning capabilities of SMT solvers that support floating- 
point arithmetic and Isabelle’s built-in automated proof procedures, which are 
used to reconstruct proofs. The latter currently do not offer dedicated support 
for floating-point reasoning, but need to rely on explicit lemmas to reason about 
concepts for which the SMT solver, when interpretation is enabled, can employ 
specialized decision procedures. Users may opt to bypass proof reconstruction 
and use external SMT solvers as oracles; however, this reduces trust in the result- 
ing theorems, as errors in the SMT solver, in the translation from Isabelle/HOL 
to SMT-LIB, or in the Isabelle/HOL model of floating-point arithmetic could 
lead to unsound results. The approach preferred by the interactive theorem prov- 
ing community is that of a skeptic [24]—external proofs should be reconstructed 
internally. If successful, this approach combines the speed of the SMT solver 
with the reliability of the proof assistant. 

Efficient reconstruction of proofs has previously been achieved for other SMT- 
LIB logics [11], and is likely possible also for floating-point reasoning, through 
improving on the proof information provided by SMT solvers and translating 
theory-specific inferences. An automated proof procedure for floating-point arith- 
metic implemented on top of Isabelle’s inference kernel would both facilitate 
the reconstruction of external proofs and increase the built-in automation for 
floating-point reasoning available in Isabelle/HOL. The implementation of such 
a proof procedure will require substantial work, but the evaluation results in 
this paper (in particular, the difference between Tables 2 and 3) clearly
indicate that the effort would not be wasted.
Abstract. Interactive theorem provers are today increasingly used to
certify mathematical theories. To formally prove a theorem, reasoning 
procedures called tactics are invoked successively on the proof states 
starting with the initial theorem statement, transforming them into sub- 
sequent intermediate goals, and ultimately discharging all proof obliga- 
tions. In this work, we develop and experimentally evaluate approaches 
that predict the most likely tactics that will achieve particular desired 
transformations of proof states. First, we design several characterizations 
to efficiently capture the semantics of the proof transformations. Then 
we use them to create large datasets on which we train state-of-the-art 
random forests and language models. The trained models are evaluated 
experimentally, and we show that our best model is able to guess the right 
tactic for a given proof transformation in 74% of the cases. Finally, we 
use the trained methods in two applications: proof shortening and tactic 
suggesting. To the best of our knowledge, this is the first time that tac- 
tic synthesis is trained on proof transformations and assists interactive 
theorem proving in these ways. 


Keywords: Interactive theorem proving · Machine learning · Neural
networks


1 Introduction 


Interactive theorem provers (ITPs) [15] are sophisticated systems used for con- 
structing machine-verified proofs. Various proof assistants, such as HOL4 [31], 
HOL Light [14], Lean [23], Isabelle/HOL [24], and Mizar [3], are used by formal- 
izers. Coq [33] is one of the most popular proof assistant systems. Coq formalizers 
invoke reasoning procedures called tactics that transform proof states into simpler
proof states, eventually discharging all proof obligations and thus proving
the initial proof state.
Theorem rev_length : ∀ l : list nat, length (rev l) = length l.
Proof.
  intros l. induction l as [| n l’ IHl’].
  - reflexivity.
  - simpl. rewrite → app_length. simpl. rewrite → IHl’.
    rewrite add_comm. reflexivity.
Qed.


Fig. 1. A formal Coq proof, showing the equality property of the lengths of a list and 
its reverse 


To give a simple example, we show a Coq proof of the equality of the lengths of 
a list and its reverse (Fig. 1). To complete the proof, one can perform induction 
on the list l (with the help of the tactic induction l as [| n l’ IHl’]),
splitting the proof state into a case where l is empty and a case where l is
nonempty. In the first case, the goal reduces to length (rev []) = length
[], which is easily discharged using simple computation. In the second case, we
obtain the induction hypothesis IHl’ that states length (rev l’) = length
l’ and need to prove that the equation still holds when the original list has a
natural number n prepended to it. After some simplification, we transform the 
length of the concatenation of two lists into the summation of their individual 
lengths. Then, with the help of the induction hypothesis, we simplify the goal. 
Finally, we rewrite the goal by the commutative property of addition and obtain 
a simple equation to prove. 
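The chain of rewrites in the inductive case can be mirrored numerically. The following Python sketch is an informal analogue, not a proof: rev is a recursive list reversal written for this illustration, and inductive_step evaluates the left-hand side of the goal after each tactic for one concrete head and tail.

```python
def rev(lst):
    """Reverse a list by structural recursion, appending the head at the end."""
    if not lst:
        return []
    head, tail = lst[0], lst[1:]
    return rev(tail) + [head]


def inductive_step(head, tail):
    """Evaluate the goal's left-hand side after each step of the inductive case."""
    after_simpl = len(rev(tail) + [head])            # length (rev l' ++ [n])
    after_app_length = len(rev(tail)) + len([head])  # length (rev l') + 1
    after_hypothesis = len(tail) + 1                 # length l' + 1
    after_add_comm = 1 + len(tail)                   # 1 + length l'
    target = len([head] + tail)                      # S (length l')
    return [after_simpl, after_app_length, after_hypothesis, after_add_comm, target]


steps = inductive_step(7, [1, 2, 3])
print(steps)
```

Every entry of the returned list is the same number, which reflects that each rewrite preserves the value of the left-hand side until it syntactically matches the right-hand side.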

A Coq proof state consists of a list of hypotheses and a goal that needs 
to be proven. Given a proof state before the tactic application, the tactic may 
either transform the before state to several after states or finish the proof. The 
semantics of a tactic is captured by the (usually infinite) set of proof state
transformations that can potentially be generated by that tactic. In this work, we
approximate that infinite set with a finite dataset of transformations that occur 
in real proofs written by Coq users. We then use machine learning models to 
gain an understanding of tactics using their approximated semantics. 

As an example, Fig. 2 presents the before and after states of the tactic
rewrite add_comm at its position in Fig. 1. In this particular case, the hypothe- 
ses remain unchanged, but in the goal, the two sides of the addition are swapped. 


(a) Before state

  n : nat
  l’ : list nat
  IHl’ : length (rev l’) = length l’
  ------------------------------------ (1/1)
  length l’ + 1 = S (length l’)

(b) After state

  n : nat
  l’ : list nat
  IHl’ : length (rev l’) = length l’
  ------------------------------------ (1/1)
  1 + length l’ = S (length l’)


Fig. 2. The before and after states of rewrite add_comm in Fig. 1, with hypotheses
above the dashed line and the required goal below it.
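One possible way to represent such a transformation as a data point is sketched below. The class and field names (ProofState, Transformation, hypotheses_unchanged) are hypothetical and chosen for this illustration only; they do not describe the data format actually used in the experiments. Terms are kept as plain strings for simplicity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProofState:
    hypotheses: Tuple[Tuple[str, str], ...]  # (name, statement) pairs
    goal: str


@dataclass(frozen=True)
class Transformation:
    tactic: str
    before: ProofState
    after: Tuple[ProofState, ...]  # empty when the tactic finishes the proof


def hypotheses_unchanged(t: Transformation) -> bool:
    """True when every after state keeps the hypotheses of the before state."""
    return all(s.hypotheses == t.before.hypotheses for s in t.after)


# The transformation shown in Fig. 2.
hyps = (
    ("n", "nat"),
    ("l'", "list nat"),
    ("IHl'", "length (rev l') = length l'"),
)
example = Transformation(
    tactic="rewrite add_comm",
    before=ProofState(hyps, "length l' + 1 = S (length l')"),
    after=(ProofState(hyps, "1 + length l' = S (length l')"),),
)
print(hypotheses_unchanged(example))
```

A finite dataset of such records, collected from existing proofs, is then the approximation of each tactic's semantics that the learning models see.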
In this paper, we consider the machine learning task of predicting a tactic
capable of generating a given proof state transformation and investigate the
applications of this task. Formally, given a before state $ps$ and $n$ after states
$\{ps'_i\}_{1..n}$, we attempt to predict a tactic $t$ that transforms $ps$ to
$\{ps''_i\}_{1..n}$ such that $ps'_i$ is equal to $ps''_i$ modulo α-equivalence
for every $i$.
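Equality modulo α-equivalence means that two terms are identified when they differ only in the names of bound variables. The sketch below illustrates the idea on untyped lambda terms by converting them to de Bruijn indices; it is a simplification, since Coq terms are considerably richer, and the tuple encoding of terms is an assumption made for this example.

```python
# Terms: ("var", name) | ("lam", name, body) | ("app", fun, arg)


def to_de_bruijn(term, env=()):
    """Replace bound variable names by the distance to their binder.

    env lists the enclosing binder names, innermost first.
    """
    kind = term[0]
    if kind == "var":
        name = term[1]
        for depth, bound in enumerate(env):
            if bound == name:
                return ("bound", depth)
        return ("free", name)  # free variables keep their name
    if kind == "lam":
        _, name, body = term
        return ("lam", to_de_bruijn(body, (name,) + env))
    if kind == "app":
        _, fun, arg = term
        return ("app", to_de_bruijn(fun, env), to_de_bruijn(arg, env))
    raise ValueError(f"unknown term constructor: {kind!r}")


def alpha_equivalent(s, t):
    """Two terms are alpha-equivalent iff their de Bruijn forms coincide."""
    return to_de_bruijn(s) == to_de_bruijn(t)


identity_x = ("lam", "x", ("var", "x"))
identity_y = ("lam", "y", ("var", "y"))
print(alpha_equivalent(identity_x, identity_y))
```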


1.1 Motivation 


Tactic prediction methods have so far relied solely on before states, typically 
to guide automated tactical proof search in systems like Tactician [6]. We are 
interested in synthesizing tactics based both on the before and after states for a 
number of reasons. 

First, there are multiple interesting applications of this task. For example, 
formalizers may want to arrive at a particular proof state, given a particular 
initial proof state. Or, given particular before and after states that were gener- 
ated with a sequence of tactics, we may want to find a single tactic capturing 
the transformation, thus shortening and simplifying the proof, and teaching the 
formalizer how to use the available tactics. 

Second, our work is the first step to designing a novel human-like proof 
search strategy. When mathematicians write pen-and-paper proofs, they often
first imagine some intermediate goals and then sequentially fill in the gaps. This 
provides another motivation: our trained predictors can recommend the tactics 
that will bridge the gaps between such intermediate human-designed proof goals. 

Third, the task can be of particular importance for the ITPs which support 
constructing proofs in a declarative proof style, such as Isabelle, Mizar, and 
Lean. In declarative-style proofs, the after states are often specified manually
by the user. A large formal library, the Mizar Mathematical Library [2], is developed
in a declarative style. The Isabelle Archive of Formal Proofs (one of the most 
developed libraries today) is also predominantly written in a declarative style. 
Our approach can be directly applied to predict tactics able to fill the gap 
between two subsequent declarative statements. 

Finally, the learned tactic embeddings could be used to perform MuZero- 
style [30] reinforcement learning, which means obtaining the after states by com- 
bining the embeddings of the before states and of the tactics without actually 
running the ITP. This could be particularly useful when some tactic applications 
require large computational resources. 


1.2 Contributions 
The main contributions of our paper can be summarized as follows. 


1. To the best of our knowledge, we are the first to predict tactics based on the
transformation they make between before and after states.

2. In Sect. 2, to capture the semantics of tactics, we design three characterizations:
feature difference, anti-unification, and tree difference.

3. In Sect. 4, we conduct experiments to verify the strengths of our characterizations
with a random forest classifier and the GPT-2 language model.

4. In Sect. 5, we propose and evaluate two applications of the task, namely tactic 
suggestion and proof shortening. 


Besides the above-mentioned contributions, Sect. 3 introduces the preliminaries 
of the learning technology used in this paper. We discuss two related research 
fields in Sect. 6. The conclusions and future work are presented in Sect. 7.


2 Proof State Characterizations 


To train the machine learning models, we need to provide characterizations of 
the before and after states. Apart from directly using the unprocessed textual 
representation of proof states, we design three characterizations: feature differ- 
ence, anti-unification, and tree difference. 


2.1 Feature Difference 


To characterize the proof states, we start with the features used by [42]. In that
work, the features were used to apply machine learning to predict tactics for
proof states. For example, GOAL-$l' and HYPS-Coq.Lists.List.rev-$l' are
two features extracted from the before state in Fig. 2. The prefixes GOAL and
HYPS denote whether a feature belongs to the goal or the hypotheses. The symbol
$l' denotes a node that occurs in the abstract syntax tree (AST) of the
proof state. The prefix $ means that l' denotes a named variable. We subsequently
consider the nodes connected in the AST. For example, the feature
Coq.Lists.List.rev-$l' means that the identifier of the list reversal operation
and the list l' are connected in the AST.

For the current work, we additionally consider feature difference. From the
before state $ps$ and after states $\{ps'_i\}_{1..n}$, we extract features $f$ and $\{f'_i\}_{1..n}$,
respectively, using the procedure discussed above. We define $f'$ as the union of
$\{f'_i\}_{1..n}$. By set difference, we compute the disappeared features $f - f'$ and the
appearing features $f' - f$. The disappeared features and appearing features are
together used as the feature difference characterization of the tactic.
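The set-difference step can be illustrated with a minimal Python sketch. The function name `feature_difference` and the example feature strings are hypothetical and chosen only for illustration; the feature extractor of [42] itself is not reproduced here.

```python
def feature_difference(before_features, after_feature_sets):
    """Return (disappeared, appearing) features of one tactic application.

    before_features: features f extracted from the before state.
    after_feature_sets: one collection of features f'_i per after state.
    """
    f = set(before_features)
    # f' is the union of the features of all after states.
    f_prime = set().union(*after_feature_sets)
    disappeared = f - f_prime
    appearing = f_prime - f
    return disappeared, appearing


# Hypothetical feature names, for illustration only.
before = {"GOAL-$l'", "GOAL-old-feature"}
afters = [{"GOAL-$l'", "GOAL-new-feature"}]
disappeared, appearing = feature_difference(before, afters)
print(sorted(disappeared), sorted(appearing))
```

Features shared by the before and after states cancel out, so only the part of the state touched by the tactic contributes to the characterization.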


2.2 Anti-unification 


Anti-unification, first proposed by Plotkin [27] and Reynolds [29], aims to cal- 
culate generalizations of the given objects. Since Coq is based on the Calculus 
of Inductive Constructions (CIC) [25], an appropriate anti-unification algorithm 
for Coq should be higher-order. However, higher-order anti-unification is unde- 
cidable [26]. Therefore, we first convert Coq terms to first-order terms so that 
we can execute a decidable and efficient first-order anti-unification algorithm. 
To encode Coq terms into first-order logic, we transform them recursively 
following the AST. First-order applications and constants are encoded directly,

[Figure 3: tree diagram with root State and children Hyps and Goal. The hypotheses n, l', and IHl' are unchanged; in the goal, the two arguments of the addition are replaced by the variables Var0 and Var1.]


Fig. 3. The least general generalization of the before and after states in Fig. 2 


other applications use the apply functor app and all other cases use special 
first-order functions (e.g., a dependent product is encoded as a first-order func- 
tion prod). The goal of the before state in Fig. 2 will be converted to the first- 
order term = (+(length(l’), S(O)), S(length(l'))). The non-leaves =, +, length, S 
denote function symbols. The leaves l’ and O denote constants. 

Terms in first-order anti-unification are defined as $t ::= x \mid a \mid f(t_1, \ldots, t_n)$,
where $x$ is a variable, $a$ is a constant, $f$ is an $n$-ary function symbol, and each $t_i$ is
a term. In this paper, letters $s, t, u$ denote terms, letters $f, g, h$ denote function
symbols, letters $a, b$ denote constants, and letters $x, y$ denote variables. Substitutions
map variables to terms and are usually written in the form of sets. We
can represent a substitution $\sigma$ as a set $\{x \mapsto \sigma(x) \mid x \neq \sigma(x)\}$, where $\sigma(x)$
is the term to which $x$ is mapped. The application of a substitution $\sigma$ to a term $t$ is
written $t\sigma$. If $t$ is a variable, then $t\sigma = \sigma(t)$. If $t = f(t_1, \ldots, t_n)$, then
$t\sigma = f(t_1\sigma, \ldots, t_n\sigma)$. A term $u$ is called a generalization of a term $t$ if there exists
a substitution $\sigma$ such that $u\sigma = t$. For instance, the term $f(g(x), y)$ is a generalization
of the term $f(g(a), h(a, b))$: with the substitution $\sigma = \{x \mapsto a, y \mapsto h(a, b)\}$,
we have $f(g(x), y)\sigma = f(g(a), h(a, b))$.
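The definition of substitution application can be checked on the example above with a short Python sketch. The term encoding (variables as strings, constants as 1-tuples, applications as tuples headed by the function symbol) and the function names are assumptions made for illustration.

```python
def apply_subst(term, sigma):
    """Apply substitution sigma (dict: variable name -> term) to a term.

    Encoding assumed here: a variable is a str; a constant a is the
    1-tuple ("a",); an application f(t1, ..., tn) is ("f", t1, ..., tn).
    """
    if isinstance(term, str):
        # A variable is replaced by its image, or kept if sigma does not move it.
        return sigma.get(term, term)
    head, *args = term
    # Constants have no arguments, so they are returned unchanged.
    return (head, *(apply_subst(arg, sigma) for arg in args))


def is_generalization(u, t, sigma):
    """Check that sigma witnesses u as a generalization of t."""
    return apply_subst(u, sigma) == t


a, b = ("a",), ("b",)
u = ("f", ("g", "x"), "y")          # f(g(x), y)
t = ("f", ("g", a), ("h", a, b))    # f(g(a), h(a, b))
sigma = {"x": a, "y": ("h", a, b)}
print(is_generalization(u, t, sigma))
```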

Anti-unification aims to obtain the least general generalization (lgg) of two
terms $s$ and $t$. A term $u$ is called a generalization of $s$ and $t$ if there exist
substitutions $\sigma_1$ and $\sigma_2$ such that $u\sigma_1 = s \land u\sigma_2 = t$. A generalization $u'$ of $s$
and $t$ is called the lgg if, for any generalization $u$ of $s$ and $t$, there is a substitution
$\sigma$ such that $u\sigma = u'$. Assuming $\phi$ is a bijective function from a pair of terms to
a variable, given two terms $s$ and $t$, the anti-unification algorithm AU calculates
the lgg using the two rules below.


— $AU(s, t) = f(AU(s_1, t_1), \ldots, AU(s_n, t_n))$ if $s = f(s_1, \ldots, s_n)$ and $t = f(t_1, \ldots, t_n)$
— $AU(s, t) = \phi(s, t)$ if the preceding rule does not match.
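The two rules can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the paper: terms are assumed to be ground and encoded as tuples headed by the function symbol (constants as 1-tuples), and the bijection $\phi$ is modelled by a dictionary that assigns fresh names `Var0`, `Var1`, ... to pairs of terms.

```python
def anti_unify(s, t, phi=None):
    """Return (lgg, phi) for two ground first-order terms.

    Terms are tuples ("f", arg1, ..., argn); a constant is ("a",).
    phi maps a pair of terms to the variable that generalizes it.
    """
    if phi is None:
        phi = {}
    # Rule 1: same function symbol and arity, so recurse on the arguments.
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        args = [anti_unify(a, b, phi)[0] for a, b in zip(s[1:], t[1:])]
        return (s[0], *args), phi
    # Rule 2: otherwise generalize the pair with the variable phi(s, t).
    if (s, t) not in phi:
        phi[(s, t)] = f"Var{len(phi)}"
    return phi[(s, t)], phi


# Goals of the before and after states in Fig. 2, with 1 written as S(O).
length_l = ("length", ("l'",))
one = ("S", ("O",))
rhs = ("S", length_l)
before_goal = ("=", ("+", length_l, one), rhs)
after_goal = ("=", ("+", one, length_l), rhs)

lgg, phi = anti_unify(before_goal, after_goal)
print(lgg)
print(phi)
```

Reading `phi` column-wise gives the two substitutions that recover the before and after goals from the lgg; reusing the same variable for the same pair of terms is what keeps the result least general.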


Figure 3 presents the lgg of the before and after states considered in Fig. 2. 
Compared to the before state, most of the nodes in the lgg remain the same.
The differences lie in the left side of the equality in the goal: length l' is
substituted with Var0, and the natural number 1 is substituted with Var1. We
need to apply the substitutions $\{Var_0 \mapsto length\ l',\ Var_1 \mapsto 1\}$ and $\{Var_0 \mapsto 1,\ Var_1 \mapsto length\ l'\}$
to the lgg to obtain the before and after states, respectively.

We compute the lggs of the goals and the hypotheses separately. We can 
directly anti-unify the goals of the before and after states. However, the num- 
ber of hypotheses may be changed by the tactic application. For instance, the 
tactic intros introduces new hypotheses, while the tactic clear H removes the 
hypothesis H. Suppose we are anti-unifying the hypotheses $hyps(h_1, \ldots, h_n)$ and
$hyps(h_1, \ldots, h_n, h_{n+1})$. The first rule of anti-unification immediately fails, and
the second rule will generate a variable that corresponds to all hypotheses in the 
before state and all hypotheses in the after states. Therefore, anti-unifying all 
hypotheses together prevents us from developing a compact characterization. To 
calculate the lggs of hypotheses, we first match the hypotheses with the same 
names. Then, we compute an lgg on each pair. We refer to the hypotheses that 
are only in the before state and only in the after state as respectively deleted 
hypotheses and inserted hypotheses. Different from the pairwise hypotheses, we 
do not perform anti-unification on the deleted hypotheses and inserted hypothe- 
ses, and they remain unchanged. 
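The name-based matching of hypotheses can be sketched as follows. The dictionary representation of hypotheses (name mapped to its statement as a string) and the function name `match_hypotheses` are assumptions for illustration.

```python
def match_hypotheses(before, after):
    """Split hypotheses into matched pairs, deleted ones, and inserted ones.

    before, after: dicts mapping a hypothesis name to its statement.
    """
    # Hypotheses with the same name are paired; an lgg is computed per pair.
    matched = {name: (before[name], after[name])
               for name in before if name in after}
    # Hypotheses present on one side only are kept as they are.
    deleted = {name: stmt for name, stmt in before.items() if name not in after}
    inserted = {name: stmt for name, stmt in after.items() if name not in before}
    return matched, deleted, inserted


# Hypothetical states around an application of `clear H`.
before = {"n": "nat", "H": "n = 0"}
after = {"n": "nat"}
matched, deleted, inserted = match_hypotheses(before, after)
print(matched, deleted, inserted)
```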

We choose anti-unification because it can generate a more compact representation
than directly using the before and after states. Consider
Fig. 2: without anti-unification, we need one Coq string for the before state and another for the
after state to characterize the transformation. Notice that many parts of the
before state are unchanged after the tactic application. It is redundant to repre- 
sent these unchanged parts twice in both the before and after states. However, 
anti-unification enables us to use a single lgg and the substitutions to character- 
ize the transformation. The unchanged parts of the before and after states are 
shared in the lgg. Moreover, previous research has demonstrated that features 
based on generalization are very helpful for theorem proving [19]. 


2.3 Tree Difference 


In addition to anti-unification, we propose a characterization based on a tree 
difference algorithm [21]. Compared to anti-unification, tree difference is better 
at generalizing the differences between the before and after states. Tree difference
extends the standard Unix diff [16] algorithm with the capability to compute
the differences according to the tree structures. Since proof states have tree 
structures, such tree differences can be used to characterize the transformations. 

Take the before and after states in Fig.2 for demonstration. First, for the 
hypotheses that are the same in the before and after states, we keep them 
unchanged. Therefore, the hypotheses n, l', and IHl' remain the same.

The next step is to extract common subtrees from the original trees (except 
for the unchanged hypotheses) to obtain more compact characterizations. We 
focus on the ASTs of Coq terms. Assuming there is an oracle to judge whether 
the current subtree is a common subtree, we traverse a tree from the root. The 
calculation of the oracle is explained in the original paper [21]. If the current
subtree is a common subtree and not a leaf node, we substitute it with a hole.
We do not substitute leaves with holes because, in practice, the substitutions of 
leaves lead to many unexpected holes. The same common subtrees should always 
be substituted with the same hole. The results of applying the substitutions to 
the before and after states are called the deletion context and the insertion 
context, respectively. After the substitutions, the deletion and insertion contexts 
are shown in Fig. 4. 

Afterward, we calculate the greatest common prefix (gcp) of the deletion and 
insertion contexts and obtain a patch. According to the original algorithm, if the 
two trees have the same non-hole node, we keep the node unchanged and execute 
the algorithm on their children. Otherwise, we denote them as a change. 
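The gcp step described above can be sketched in Python. This is a simplified illustration, not the algorithm of [21]: non-hole nodes are assumed to be tuples headed by their label, holes are plain strings, and `Change` is a hypothetical record type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """A position where the deletion and insertion contexts differ."""
    deleted: object
    inserted: object


def gcp(deletion, insertion):
    """Greatest common prefix of a deletion and an insertion context.

    A non-hole node is a tuple (label, child1, ..., childn); a hole is a str.
    """
    same_node = (isinstance(deletion, tuple) and isinstance(insertion, tuple)
                 and deletion[0] == insertion[0]
                 and len(deletion) == len(insertion))
    if same_node:
        # Same non-hole node: keep it and descend into the children.
        children = (gcp(d, i) for d, i in zip(deletion[1:], insertion[1:]))
        return (deletion[0], *children)
    # Anything else, including a pair of holes, is recorded as a change.
    return Change(deletion, insertion)


# Goals of the contexts for the rewrite add_comm step, written with holes.
deletion_goal = ("=", ("+", "Hole0", "Hole1"), "Hole2")
insertion_goal = ("=", ("+", "Hole1", "Hole0"), "Hole2")
patch = gcp(deletion_goal, insertion_goal)
print(patch)
```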


[Figure 4: two tree diagrams with root State and children Hyps and Goal. (a) Deletion context: the goal is = applied to +(Hole0, Hole1) and Hole2. (b) Insertion context: the goal is = applied to +(Hole1, Hole0) and Hole2.]


Fig. 4. The deletion and insertion contexts of the before and after states in Fig. 2.
Hole0, Hole1, and Hole2 denote length l', 1, and S(length l'), respectively.


[Figure 5: tree diagram of the patch, with root State and children Hyps and Goal. Under the + node of the goal, the patch contains the changes Change(Hole0, Hole1) and Change(Hole1, Hole0).]


Fig. 5. The patch of the before and after states in Fig. 2

Similar to anti-unification, due to the deletion, insertion, and reordering
of the hypotheses, we need to adjust the gcp algorithm for proof states. We
match hypotheses by their names and obtain the deleted hypotheses, inserted
hypotheses, and matched hypotheses as in Sect. 2.2. We only calculate gcps
on the matched hypotheses. The deleted hypotheses and inserted hypotheses
are represented as a change. Executing gcp on proof states returns a patch in
the format state(hyps_patch, goal_patch), where hyps_patch is constructed
as hyps(h_1, ..., h_n, change(del_hyps, ins_hyps)). Each h_i is the patch of two
matched hypotheses. Figure 5 depicts the patch of the before and after states in
Fig. 2.


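[Figure 6: tree diagram of the closed patch, with root State and children Hyps and Goal. In the goal, the subtree rooted at + is restored into a single closed change, and the identical change Change(Hole2, Hole2) remains.]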


Fig. 6. The result of applying the closure function to the patch in Fig. 5 


Subsequently, we need to calculate the closure of a patch. The intention
is to ensure that every change is closed: the left and right sides contain the
same holes. Notice that the patch in Fig. 5 contains two unclosed changes,
Change(Hole0, Hole1) and Change(Hole1, Hole0). The closure function
goes to the subtree whose root is the parent node of the unclosed change and
restores that subtree with the deletion and insertion contexts it had before we
executed gcp on them. The procedure repeats until all changes are closed. Since
the gcp function on proof states also returns a patch in a tree structure, we
can run the closure function on it. If any patch of matched hypotheses h_i or
change(del_hyps, ins_hyps) is not closed, we restore the hyps_patch with
the original deletion and insertion contexts of the hypotheses. Then, if the
goal_patch or the deletion and insertion contexts of the hypotheses are not
closed, we restore the patch of the proof states with the entire deletion and
insertion contexts of the two proof states. Figure 6 depicts the patch after the
execution of the closure function.

The final step is to replace the identical changes with their original term. The
original algorithm may produce identical changes, such as Change(Hole2, Hole2)
in Fig. 6. Since we want a compact characterization, such changes are unnecessary.
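The closure step and the removal of identical changes can be sketched together in Python. This is a simplified illustration under stated assumptions rather than the algorithm of [21]: trees are tuples headed by a label, holes are strings, "the same holes" is read as equal sets of hole names, and the names `Change`, `closed_patch`, and `drop_identical` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    deleted: object
    inserted: object


def holes(tree):
    """Collect the hole names (strings) occurring in a context."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree[1:]:
        found |= holes(child)
    return found


def is_closed(change):
    return holes(change.deleted) == holes(change.inserted)


def closed_patch(deletion, insertion):
    """Compute a patch bottom-up, restoring subtrees above unclosed changes."""
    same_node = (isinstance(deletion, tuple) and isinstance(insertion, tuple)
                 and deletion[0] == insertion[0]
                 and len(deletion) == len(insertion))
    if not same_node:
        return Change(deletion, insertion)
    children = [closed_patch(d, i) for d, i in zip(deletion[1:], insertion[1:])]
    if any(isinstance(c, Change) and not is_closed(c) for c in children):
        # Restore this subtree with its deletion and insertion contexts.
        return Change(deletion, insertion)
    return (deletion[0], *children)


def drop_identical(patch):
    """Replace every change whose two sides coincide by the shared term."""
    if isinstance(patch, Change):
        return patch.deleted if patch.deleted == patch.inserted else patch
    if isinstance(patch, str):
        return patch
    return (patch[0], *(drop_identical(child) for child in patch[1:]))


deletion_goal = ("=", ("+", "Hole0", "Hole1"), "Hole2")
insertion_goal = ("=", ("+", "Hole1", "Hole0"), "Hole2")
closed = closed_patch(deletion_goal, insertion_goal)
final = drop_identical(closed)
print(final)
```

Because the restored change only mentions holes, the resulting patch has the same shape for any goal in which the two sides of an addition are swapped.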

Tree difference is better at generalizing the differences than anti-unification.
Take the example in Fig. 2. The lgg in Fig. 3 merely
shows that the proof state changes in the position of the variables. The substitu- 
tions may be different if we execute rewrite add_comm on different proof states. 
However, in the patch generated by the tree difference in Fig. 6, the changes are 
generalized because we substitute common subterms with holes and will be the 
same even if we execute rewrite add_comm on different proof states. 


2.4 Input Formats 


During training, the language model receives the string 
<Characterization> Tactic: <Tactic> as input. <Characterization> has 
four variations: 


— Before:<Before State> 

— Before:<Before State> After: [<After State>] 

— Anti: [<Substs> <Delete_hyps> <Insert_hyps> <Lgg>]

— TreeDiff:[<Patch> <Hole>]


A proof state is represented as a sequent <Hyps> |- <Goal>. The plain texts 
(like Tactic: ) serve as prompts, while the placeholders (such as <Before State> 
and <Tactic>) are substituted according to the proof context. [] denotes a list. 
During prediction, the language model receives <Characterization> Tactic: 
as input and outputs the predicted tactics. 
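The input formats can be assembled with simple string templates, as in the sketch below. The helper names are hypothetical, and the separators (a comma between hypotheses, a space between after states) are assumptions, since the text only fixes the prompts and the sequent form `<Hyps> |- <Goal>`.

```python
def sequent(hyps, goal):
    """Render a proof state as <Hyps> |- <Goal> (comma separator assumed)."""
    return f"{', '.join(hyps)} |- {goal}"


def before_after(before, afters):
    """Characterization built from the unprocessed before and after states."""
    after_list = " ".join(afters)  # list separator is an assumption
    return f"Before:{before} After: [{after_list}]"


def training_example(characterization, tactic):
    """String seen by the language model during training."""
    return f"{characterization} Tactic: {tactic}"


def prediction_prompt(characterization):
    """String given to the language model at prediction time."""
    return f"{characterization} Tactic:"


hyps = ["n : nat", "l' : list nat"]
before = sequent(hyps, "length l' + 1 = S (length l')")
after = sequent(hyps, "1 + length l' = S (length l')")
characterization = before_after(before, [after])
print(training_example(characterization, "rewrite add_comm"))
print(prediction_prompt(characterization))
```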

Random forests are fed discrete features as input. For feature difference, 
the disappeared features and appearing features are distinguished from each 
other (appearing features and disappeared features as introduced in Sect. 2.1). 
To utilize anti-unification, we convert the lgg and the terms in the substitution 
that should be used to obtain the before and after states to features in three 
disjoint spaces. For anti-unification, we also distinguish the features of deleted 
hypotheses and inserted hypotheses from other ones. For tree difference, we 
distinguish the gcp of the proof states, the origin and the destination of changes, 
and the common subterms into four spaces. 


3 Learning Models 


We consider two machine learning models for the task. The models will be com- 
pared experimentally in the next section. 

The first model is a random forest classifier [7]. Random forests are based 
on decision trees. In decision trees, leaves represent labels (tactics in our case), 
and internal nodes correspond to features. A rule is a path from the root to 
a non-leaf. It represents the conjunction of all features on the path. A rule is 
determined by maximizing the information gain of examples. For instance, if we
have examples with labels {b, b, b, a, a}, we want to generate a rule that passes
all examples with the label a to its left child and all examples with the label b 
to its right child. A forest makes predictions by voting based on a large number 
of decision trees. Random forests contain several sub-forests. Each sub-forest 
is built on a random subset of the entire dataset. We choose a random forest 
implementation that has previously been used to predict tactics for Coq [42]. 
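The split on the labels {b, b, b, a, a} can be made concrete with a small Python sketch. It measures the quality of a split by the decrease in Gini impurity, which is one common choice and an assumption here; the function names are hypothetical and this is not the implementation of [42].

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of labels (0 for a pure node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


labels = ["b", "b", "b", "a", "a"]
# The ideal rule sends every a to the left child and every b to the right child.
perfect = impurity_decrease(labels, ["a", "a"], ["b", "b", "b"])
# A poor rule leaves both children mixed.
poor = impurity_decrease(labels, ["a", "b"], ["b", "b", "a"])
print(perfect, poor)
```

The perfect split removes all impurity of the parent node, while the mixed split removes very little, which is why the former rule is preferred.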

The second model is the pre-trained language model
GPT-2 [28]. GPT-2 is based on neural networks, which consist of many artificial 
neurons to learn from training data. The self-attention [35] technique is inten- 
sively applied in GPT-2 to differentially weigh every part of the input data. 
As a language model, GPT-2 predicts the probability distribution of the next 
word given a sequence of words as the input. GPT-2 is a pre-trained language 
model. The concept of pre-training imitates the learning process of humans. 
When humans encounter a new task, humans do not need to learn it from 
scratch. They will transfer and reuse their old knowledge to learn to solve it. 
Similarly, GPT-2 is pre-trained on a large natural language dataset BooksCor- 
pus [43]. Afterward, GPT-2 can reuse the knowledge of natural language learned 
from pre-training to solve new tasks. To be adapted to a new task, we need to 
fine-tune GPT-2 on a relatively small dataset and slightly modify the weights 
learned from pre-training. We choose GPT-2 because pre-trained language
models have recently demonstrated outstanding achievements in natural language
processing (NLP) [8] and formal mathematics [34,39].


4 Experiments 


We perform the experiments on the dataset extracted from the Coq standard 
library. The dataset consists of 158,494 states extracted from 11,372 lemmas. 
We randomly split the dataset into three subsets for training, validation, and 
testing in an 80-10-10% ratio. For random forests, we first use 100 trees by default and
optimize the Gini Impurity [22]. Gini Impurity is a metric of the information gain.
After the optimization, we set the Gini Impurity to its best value, try various 
numbers of trees and obtain the optimized number of trees. Finally, the best 
combination of Gini Impurity and the number of trees is determined for each 
characterization. The experiments with GPT-2 are based on the Hugging Face 
library [38]. In particular, we employ the smallest GPT-2. The hyper-parameters 
are: eta = 3e-4, num_beams = 3, batch_size = 32. During training, we apply
a linear schedule with the first 20% training steps for warm-up. The remain- 
ing parameters are left as their default values. At most 50 tokens are predicted 
for a single tactic. We truncate the input on the left side if it is longer than 
the maximal length limitation of GPT-2 (1024 tokens). Language models have 
length limitations for efficiency. The attention mechanism used by them causes 
a quadratic usage of memory as the length of tokens scales. Every model is 
trained for 25 epochs on an NVIDIA V100 GPU, and the snapshot with the 
highest accuracy on the validation dataset is selected for testing. 
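Two details of this setup, left truncation and the warm-up schedule, can be sketched in plain Python. The function names are hypothetical, and the linear decay to zero after warm-up is an assumption: the text only states that a linear schedule with 20% warm-up steps is used.

```python
def truncate_left(token_ids, max_len=1024):
    """Drop tokens from the left so that at most max_len tokens remain."""
    if len(token_ids) <= max_len:
        return token_ids
    # Keeping the right end preserves the trailing "Tactic:" prompt.
    return token_ids[-max_len:]


def linear_schedule(step, total_steps, warmup_steps, peak_lr=3e-4):
    """Learning rate at a given step: linear warm-up, then linear decay."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 100
warmup_steps = total_steps // 5  # first 20% of the steps
for step in (0, 10, 20, 60, 100):
    print(step, linear_schedule(step, total_steps, warmup_steps))

print(len(truncate_left(list(range(1030)))))
```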

Table 1. Results on the test dataset, showing how often the prediction makes the same
transformation as the tactic in the library. The transformations are considered modulo
$\alpha$-equivalence.


                     | random forests | GPT-2
  before             | 43.23%         | 46.84%
  before after       | 52.17%         | 67.45%
  feature difference | 59.34%         | -
  anti-unification   | 58.59%         | 71.74%
  tree difference    | 58.98%         | 73.83%


Table 1 depicts the results of our experiments. The accuracies of the combinations
of before states with after states are significantly better than only relying
on the before states in both random forests and GPT-2. Thus, we conclude
that taking after states into consideration is very helpful for learning the semantics
of tactics. The accuracies of GPT-2 are significantly higher than those of random
forests, which suggests that the pre-trained language model is better suited to
this task than random forests. For random forests,
all of the feature difference, anti-unification, and tree difference perform better 
than the unprocessed before and after states. This indicates that our character- 
izations can extract more precise features for random forests. We do not apply 
GPT-2 to feature differences, as it relies on natural language. In principle, it 
would be possible to give it feature differences directly as input, but as there 
are very few similarities between features and natural language it would be a 
serious disadvantage to the model: the knowledge acquired through pre-training is
difficult to apply to understanding features. Although feature difference is a little
better than anti-unification and tree difference, their results are quite similar. 
The probable explanation is that random forests are not good at learning from 
sophisticated features. Random forests cannot learn meaningful knowledge from 
all three characterizations and almost only learn to make correct predictions for 
the simple tactics. Similarly, with GPT-2, anti-unification and tree difference 
provide more accurate predictions than the unprocessed before and after states. 
We suppose the explanation is that we are able to appropriately shorten the 
length of the input and also keep important information about the proof trans- 
formation. Appropriately shortening the input length is beneficial for GPT-2 
because it has a maximal limitation on the number of input tokens. Table 2 
compares the percentages of the inputs that are longer than the maximal length 
limitation. The statistics show that our implementation significantly reduces the 
probability that the input is over the maximal length limitation. Tree difference 
can provide more accurate predictions compared to anti-unification with both 
random forests and GPT-2. This may be because the generalization
made by tree difference is easier for machine learning models to learn.

Table 2. The ratios of how many inputs exceed the maximal length limitation


        | before | before after | anti-unification | tree difference
  ratio | 2.07%  | 7.96%        | 4.07%            | 3.90%


5 Applications 


In this section, we propose two promising applications of the task. We only 
evaluate the most accurate of the methods proposed in the previous Sect. 4 
(GPT-2) on the two tasks. 

The first, more direct application, is making tactic suggestions. Given a before 
state, it is common for an ITP user to have an intuition of the intermediate proof 
states that are necessary to complete the proof. However, sometimes the user 
cannot guess the appropriate tactic needed to make the transformations. Using 
our model with the before state and the imagined intermediate states, the user 
can get a complete proposed proof as output. Hence, our model will predict the 
likely tactics to perform the transformations. 

The other application is shortening existing Coq proofs. Specifically, for the
transformation $ps_0 \Rightarrow_{t_0} ps_1 \Rightarrow_{t_1} ps_2 \ldots \Rightarrow_{t_n} ps_{n+1}$, where each $ps_i$ is a proof state and
each $t_i$ is a tactic, we want to predict a tactic $t'$ such that $ps_0 \Rightarrow_{t'} ps'$, where $ps'$ and
$ps_{n+1}$ are equal under $\alpha$-equivalence. Thus, we can replace the tactic sequence
with a single tactic and decrease the length of the Coq proof. A restriction for this
task is that because we are only interested in exploring shorter paths between
proof states, $ps_{n+1}$ should not be a finishing state.


Table 3. The first five tactics suggested by each characterization. The tactics displayed
in bold result in the desired after states.


    | before               | before after         | anti-unification     | tree difference
  1 | trivial              | rewrite <- minus_n_O | rewrite <- minus_n_O | rewrite sub_0_r
  2 | simpl                | rewrite sub_0_r      | rewrite Nat.sub_0_r  | rewrite Nat.sub_0_r
  3 | rewrite <- minus_n_O | rewrite <- minus_n_0 | simpl                | simpl
  4 | rewrite <- plus_n_O  | simpl                | rewrite sub_0_r      | rewrite <- sub_0_r
  5 | auto                 | rewrite <- sub_0_r   | rewrite <- plus_n_O  | apply sub_0_r


5.1 Tactic Suggestion 


We view the experiments in Sect. 4 as the evaluation of tactic suggestions. The 
before and after states extracted from the Coq standard library are considered 
as the states that are presented in the Coq editor and those in users’ minds, 
respectively. The results show that taking the after states into consideration, 
together with the more compact characterization, is essential for correctly suggesting
tactics.

The following is an actual tactic suggestion question taken from the Coq
Discourse Forum¹. The question can be summarized as finding a tactic that
transforms the following before state to the after state. The goal of the before
state is to prove that the element indexed by m - 0 in a list equals the element
indexed by m.


— Before state: l : list nat, x : nat, m : nat, H0 : 1 <= m |- nth
(m - 0) l 0 = nth m l 0

— After state: l : list nat, x : nat, m : nat, H0 : 1 <= m |- nth m l
0 = nth m l 0


Table 3 shows the first five tactics predicted by each model. If we consider only 
the before state, we will obtain the correct prediction in the third place. However, 
the first two synthesized tactics using anti-unification, tree difference as well as 
unprocessed before and after states are appropriate. Besides the tactics displayed 
in bold, other tactics do not perform the expected transformation due to various 
reasons. Some tactics such as trivial, simpl, and auto do not change the proof
state. The tactics rewrite <- plus_n_O and apply sub_0_r are not applicable
and cause errors. The lemma minus_n_0 used in rewrite <- minus_n_0 does
not exist in the Coq standard library. Although rewrite <- sub_0_r does not
cause an error, it leads to an unexpected after state l : list nat, x : nat, m :
nat, H0 : 1 <= m |- nth (m - 0) l 0 = nth m l 0 - 0. Since the operations
executed by trivial, simpl, and auto are quite complicated and may
depend on the context, we assume it is difficult for the model to comprehen- 
sively understand them. Their occurrences in the first five predictions may be 
mainly because they occur quite frequently in the training data. The results 
confirm that the combination of before and after states is beneficial for suitably 
suggesting tactics. 


5.2 Shortening Proofs 


The results presented in the previous Sect. 4 focused on decomposed tactics. This 
means compound tactic expressions that perform several steps at once have been 
decomposed into individual tactic invocations. We apply the technique that is 
developed by [5] to decompose the tactics. Here, we utilize the same models; 
however, we focus on the original human-written tactics and try to shorten 
these (shortening expanded tactics would be unfair). For all tactic sequences of 
lengths two and three in the training dataset, we input their before and after 
states into the model. In our experiment, we can only consider the states in 
the training dataset since our model is trained on all present tactics. Compared 
to the validation dataset and testing dataset, our model should be able to give 
better predictions on proof shortening for the training dataset. The amount of 
original tactics in the training dataset is 56,788. The model synthesizes 10 tactics 
for each sequence, and we execute them in Coq to verify that they perform the 
same transformation as the sequence modulo $\alpha$-equivalence.


¹ https://coq.discourse.group/t/how-to-avoid-awkward-assertions/1153/2.

Table 4. The shortening ratios and amounts of redundant tactics with different
characterizations and sequence lengths.


  length |        | before | before after | anti-unification | tree difference
  2      | ratio  | 0.379% | 0.824%       | 0.891%           | 0.833%
         | number | 215    | 468          | 506              | 473
  3      | ratio  | 0.039% | 0.148%       | 0.151%           | 0.148%
         | number | 22     | 84           | 86               | 84

The results are presented in Table 4. We define the number of redundant
tactics of $ps_0 \Rightarrow_{t_0} ps_1 \Rightarrow_{t_1} ps_2 \ldots \Rightarrow_{t_n} ps_{n+1}$ as $n$. The shortening ratio is defined
as the number of all discovered redundant tactics divided by the total number
of occurrences of tactics in the training dataset. In this section, our method
only applies to tactic sequences in which every tactic except the last
produces a single after state, whereas in Sect. 4 our experiments apply to
tactic applications that may produce several after states. The reason is that it
is difficult to calculate the number of redundant tactics if intermediate tactics
produce several after states: the tactic sequence becomes a tree of tactics,
and each path consists of a sequence of tactics. We initially expected that the
shortening ratios would not be very high because of the selected dataset. Indeed, 
the Coq standard library is written by Coq experts and has been edited and 
improved for decades, so we expected that there is not much room to improve. 
However, given the size of the dataset, the proposed technique can find a number 
of redundant tactics, which lets us conclude that taking the after states into 
consideration is useful for proof shortening. 
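As a quick arithmetic check, the shortening ratios in Table 4 can be recomputed from the reported numbers of redundant tactics. The sketch assumes that the denominator is the 56,788 original tactics of the training dataset mentioned above; the names are hypothetical.

```python
TOTAL_TACTICS = 56_788  # original tactics in the training dataset

# Numbers of redundant tactics from Table 4, keyed by sequence length.
REDUNDANT = {
    2: {"before": 215, "before after": 468,
        "anti-unification": 506, "tree difference": 473},
    3: {"before": 22, "before after": 84,
        "anti-unification": 86, "tree difference": 84},
}


def shortening_ratio(redundant, total=TOTAL_TACTICS):
    """Redundant tactics as a percentage of all tactic occurrences."""
    return 100.0 * redundant / total


for length, row in REDUNDANT.items():
    for name, count in row.items():
        print(f"length {length}, {name}: {shortening_ratio(count):.3f}%")
```

Under this assumption the recomputed percentages agree with the ratios in the table when rounded to three decimal places.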

We discover many interesting cases, where proofs can be optimized. We 
present two examples of such proofs in Table 5. The first is about the Riemann 
integral where ring and field denote algebraic structures. The Coq user first 
substituted a subterm in the proof state, rewrote the goal by several lemmas, 
and finally applied a lemma about rings. However, our model discovers the non- 
trivial transformation on ring can be completed with a single transformation in 
field. 

In the second example, the Coq library authors first applied the lemma 
Qle_lteq to transform the goal into a disjunction. Later, they selected the left 
side of the disjunction to continue the proof. Our model is able to figure out that 
the operation is redundant. Indeed, it finds another lemma Qlt_le_weak that is
able to immediately transform the goal to the left part of the disjunction. 

In addition to such more impressive examples of simpler, shorter proofs, our 
model is also able to find a few abbreviations. Such abbreviations make the proof 
shorter but do not necessarily improve their readability. For instance, our model 
sometimes combines unfold Un_growing and intro into intros x y P H n. It 
uses the implicit mechanism of intros to unfold Un_growing. However, a Coq 
user will not be able to understand what operation intros x y P H n conducts
without actually executing the Coq script.

Table 5. Two examples of shortening of proofs using the prediction. In both of the
presented cases, a single tactic provides the same transformation as a sequence of
tactics. Since the hypotheses are not changed in any of the presented examples, we
omit them and only present the goals for simplicity.


1  field makes the same transformation as (Tactic1. Tactic2.)

   State   | 1 = (x - (x + h0)) * - / h0
   Tactic1 | replace (x - (x + h0)) with (- h0); [ | ring ]
   State   | 1 = - h0 * - / h0
   Tactic2 | rewrite Ropp_mult_distr_l_reverse;
           | rewrite Ropp_mult_distr_r_reverse;
           | rewrite Ropp_involutive; apply Rinv_r_sym
   State   | h0 <> 0


2  apply Qlt_le_weak makes the same transformation as (Tactic1. Tactic2.)

   State   | (Qabs (xn p - yn q) <= 1 # z * k)%Q
   Tactic1 | apply Qle_lteq
   State   | (Qabs (xn p - yn q) < 1 # z * k)%Q \/
           | (Qabs (xn p - yn q) == 1 # z * k)%Q
   Tactic2 | left
   State   | (Qabs (xn p - yn q) < 1 # z * k)%Q


6 Related Work 


Several problems originating in formal mathematics and theorem proving have
been considered from the machine learning point of view. One of the most
explored ones is premise selection [1]. The goal of this task is to find the lemmas
in a large library that are most likely to prove a given conjecture. For premise
selection, the meaning of dependency in formal mathematics has been explored
using both approaches that try to explicitly define the logical semantics [19]
and approaches that use deep learning for this [36]. Next, it is possible
to apply machine learning to guide inference-based theorem provers. As part of
this task, the meaning of provability and step usefulness is implicitly derived
by the learning methods. This has been explored in the two top-performing
first-order theorem provers [17,32] as well as in higher-order logic automated theorem
proving [10]. Similarly, the meaning of the usefulness of a proof step has been
considered, for example as part of HOLStep [18], where various machine
learning methods try to predict if particular inferences are needed in a proof.
All these tasks are different from the task that we propose in the current paper.

Various proof automation systems have emerged to construct proofs by tactic
prediction and proof search. SEPIA infers tactics for Coq by tactic trace
and automata [13]. TacticToe [12] and Tactician [5,42] apply classical statistical
learning techniques such as k-nearest neighbors [9] and random forests [7] to
generate tactic predictions based on the before states. Several systems use neural
networks for the same task, e.g. HOList [4], CoqGym [41], and Lime [40]. These
are all different from the current work, which considers the after states as well.

Autoformalization [20] is a machine translation task applied to formal
mathematical proofs. The accuracy of the best methods applied to the task is still very
weak in comparison with human formalization [37]. However, the neural methods
already show some minimal understanding of the meaning of formalization, for
example by finding equivalent formulations. Again, this is a different task from
the one considered in the current work.


7 Conclusion 


In this paper, we propose a new machine learning task, with which we aim to cap- 
ture the semantics of tactics in formal mathematics. Based on a dataset of almost 
160 thousand proof states we consider synthesizing a tactic that transforms a 
before state to the expected after states. We implement three novel character- 
izations to describe the transformation: feature difference, anti-unification, and 
tree difference. The results of the experiments confirm the effectiveness of our 
characterizations. Two applications of the task are discussed: tactic suggestion 
for declarative proofs and proof shortening. 

In the future, we will investigate if tactic embeddings can be used directly. 
We can also try to estimate the after states by calculating the embeddings of 
the before state and the tactic or align tactics between systems in a similar way 
to how concepts are already aligned between systems [11]. 
Abstract. We describe a translation from a fragment of SUMO (SUMO-K)
into higher-order set theory. The translation provides a formal semantics
for portions of SUMO which are beyond first-order and which have
previously only had an informal interpretation. It also for the first time
embeds a large common-sense ontology into an interactive theorem proving
system. We further extend our previous work in finding contradictions
in SUMO from first-order constructs to include a portion of SUMO's
higher-order constructs. Finally, using the translation, we can create
problems that can be proven using higher-order interactive and automated
theorem provers. This is tested in several systems and used to
form a corpus of higher-order common-sense reasoning problems.


Keywords: ontology · theorem proving · Megalodon · automated theorem
proving · automated reasoning · SUMO


1 Introduction and Motivation 


The Suggested Upper Merged Ontology (SUMO) [15,16] is a comprehensive 
ontology of around 20,000 concepts and 80,000 hand-authored logical statements 
in a higher-order logic. It has an associated integrated development environment 
called Sigma [19]¹ that interfaces to theorem provers such as E [22] and Vampire
[12]. In previous work on translating SUMO to the TPTP [25] THF (Typed 
Higher-order Form) [1] format, a syntactic translation to THF was created but 
did not resolve many aspects of the intended higher-order semantics of SUMO. 

In this work, we lay the groundwork for a new translation to a language for
higher-order automated theorem provers based on expressing SUMO in
higher-order set theory. We believe this will attach to SUMO a stronger set-theoretical
interpretation that will allow deciding more queries and provide better
intuition for avoiding contradictory formalizations. Once this is done, our plan is to
train ENIGMA-style [5-8] query answering and contradiction-finding [23] AITP
systems on such SUMO problems and develop autoformalization [9-11, 28]
methods targeting common-sense reasoning based on SUMO. We believe that this is
the most viable path towards common-sense reasoning that is trainable
but also explainable and verifiable, providing an alternative to language models
which come with no formal guarantees.


1 https://www.ontologyportal.org.


© The Author(s) 2023
U. Sattler and M. Suda (Eds.): FroCoS 2023, LNAI 14279, pp. 255-274, 2023.
https://doi.org/10.1007/978-3-031-43369-6_14


256 C. E. Brown et al. 


1.1 Related Work and Contributions 


In earlier work, we described [19] how to translate SUMO to the strictly first- 
order language of TPTP-FOF [20] and TF0 [17,18,26]. SUMO has an extensive 
type structure and all relations have type restrictions on their arguments. Trans- 
lation to TPTP FOF involved implementing a sorted (typed) logic axiomatically 
in TPTP by altering all implications in SUMO to contain type restrictions on 
any variables that appear. 

In [21], 35 SUMO queries were converted into challenge problems for first-order
automated theorem provers. In many cases, first-order ATPs can prove
the corresponding problem. However, some of the queries involve aspects of
SUMO that go beyond first-order representation. For example, one of the queries
involves a term-level binder (κ).² Several of the queries also involve row variables
(also called sequence variables), i.e., variables that should be instantiated with
a list of terms. We discuss here several such examples to motivate the
translation to higher-order set theory. We then embed SUMO into the Megalodon
system, providing, to our knowledge, the first representation of a large
common-sense ontology within an interactive theorem prover (ITP). We then consider
the higher-order problems obtained via the translation. This provides a set of
challenge problems for higher-order theorem provers that come from a different
source than formalized mathematics or program verification.

The rest of the paper is organized as follows. In Sect. 2 we introduce the
SUMO-K fragment of SUMO, an extension of the first-order fragment of SUMO.
We also show there examples in SUMO that motivate the extensions. Section 3
describes a translation from SUMO-K into a higher-order set theory. We have
constructed interactive proofs of the translated form of 23 SUMO-K queries. We
describe several of the proofs in Sect. 4. From the interactive proofs we obtain
4880 ATP problems and we measure the performance of higher-order automated
theorem provers on this problem set in Sect. 5. Section 6 describes the planned
extensions and Sect. 7 concludes. Our code and problem set are available online.³


2 The SUMO-K Fragment 


We define a fragment of SUMO we call SUMO-K. This extends the first-order
fragment of SUMO with support for row variables, variable arity functions
and relations, and the κ class formation term binder.⁴ Elements of SUMO not
included in SUMO-K are temporal, modal and probabilistic operations.

We start by defining SUMO-K terms, spines (lists of terms) and formulas. 
Formally, we have standard variables (x), row variables (p) and constants (c). 


2 Note that by “term-level binder” we mean a binder that yields a term. By way of
contrast, ∀ and ∃ are formula-level binders. κ is used to form classes in SUMO.
Informally, one can think of κx.ψ as the class {x | ψ}.

3 http://grid01.ciirc.cvut.cz/~chad/sumo2set-0.9.tgz.

4 SUMO classes should not be confused with set-theoretic classes. Our use of “class”
in this paper will always refer to SUMO classes.
We will also have signed rationals (q) represented by a decimal expression with
finitely many digits (i.e., those rationals expressible in such a way) as terms. We
define by mutual recursion the sets of SUMO-K terms t, SUMO-K spines s and
SUMO-K formulas ψ as follows:


  t ::= x | c | q | (x s) | (c s) | (κx.ψ) | Real | Neg | Nonneg
      | (t + t) | (t − t) | (t * t) | (t / t)

  s ::= t s | · | p | p t ⋯ t

  ψ ::= ⊥ | ⊤ | (¬ψ) | (ψ → ψ) | (ψ ∧ ψ) | (ψ ∨ ψ) | (ψ ↔ ψ)
      | (∀x.ψ) | (∃x.ψ) | (∀p.ψ) | (∃p.ψ)
      | (t = t) | (instance t t) | (subclass t t) | (t ≤ t) | (t < t) | (c s)


The definition is mutually recursive since the term κx.ψ depends on the formula
ψ. Of course, κ, ∀ and ∃ are binders. In practice, most occurrences of p are at
the end of the spine. In some cases, however, extra arguments t₁, …, tₙ occur
after the p. The idea is that p will be a list of arguments and t₁, …, tₙ will be
appended to the end of that list. Note that at most one row variable can occur
in a spine.


2.1 Implicit Type Guards 


Properly parsing SUMO terms and formulas requires mechanisms for inferring 
implicit type guards for variables (interpreted conjunctively for κ and ∃ and via
implication for ∀). Free variables in SUMO assertions are implicitly universally
quantified and are restricted by inferred type guards, as described in [19]. In 
previous translations targeting first-order logic, relation and function variables 
are instantiated during the translation (treating the general statement quantify- 
ing over relations and functions as a macro to be expanded). Since the current 
translation will leave these as variables, we must also deal with type guards that 
are not known until the relation or function is instantiated. 


2.2 Variable Arity Relations and Functions


Consider the SUMO relation partition, declared as follows:


(instance partition Predicate) 

(instance partition VariableArityRelation) 
(domain partition 1 Class) 

(domain partition 2 Class) 


The last three items indicate that partition has variable arity with at least 2 
arguments, both of which are intended to be classes. If there are more than 2 
arguments, the remaining arguments are also intended to be classes. In general, 
the extra optional arguments of a variable arity relation or function are intended 
to have the same domain as the last required argument. We will translate partition 
to a set that encodes not only when the relation should hold, but also its domain 
information, its minimum arity and whether or not it is variable arity. 
Two other variable arity relations (with the same arity and type information
as partition) are exhaustiveDecomposition and disjointDecomposition. The follow- 
ing is an example of a SUMO-K assertion relating these concepts: 


∀p.partition p → exhaustiveDecomposition p ∧ disjointDecomposition p.


Previous translations to first-order logic expanded this assertion into several
facts for different possible arities (using a different predicate for each arity),
up to some limit. The following is an example of a partition occurring in
Merge.kif⁵ with 6 arguments:


(partition Word Noun Verb Adjective Adverb ParticleWord)


From this one should be able to infer the following query:


Example 1 (wordex). 


(query (exhaustiveDecomposition
  Word Noun Verb Adjective Adverb ParticleWord))


However, the corresponding first-order problem will not be provable unless the 
limit on the generated arity is at least 6. Our translation into set theory will free 
us from the need to know such limits in advance. 
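
The arity limit discussed above can be illustrated with a small sketch. It is not the expansion performed by Sigma or any earlier translation: the function names and the generated predicate names (partition_2, partition_3, and so on) are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch of arity-bounded macro expansion of a row-variable axiom.
# The per-arity predicate names below are illustrative assumptions.

def expand_row_axiom(max_arity, min_arity=2):
    """Return one first-order axiom per arity in [min_arity, max_arity]."""
    axioms = {}
    for k in range(min_arity, max_arity + 1):
        args = " ".join(f"?X{i}" for i in range(1, k + 1))
        axioms[k] = (
            f"(=> (partition_{k} {args}) "
            f"(and (exhaustiveDecomposition_{k} {args}) "
            f"(disjointDecomposition_{k} {args})))"
        )
    return axioms


def covers(fact, axioms):
    """A fact with k arguments is only covered if arity k was generated."""
    return (len(fact) - 1) in axioms


wordex = ("partition", "Word", "Noun", "Verb", "Adjective", "Adverb", "ParticleWord")
print(covers(wordex, expand_row_axiom(5)))  # limit below 6
print(covers(wordex, expand_row_axiom(6)))  # limit of 6
```

With a limit of 5 no instance of the axiom applies to the six-argument fact, which is the situation the set-theoretic translation is meant to avoid.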


2.3 Quantification over Relations 


Merge.kif includes assertions that quantify over relations. The following is an 
example of such an assertion: 


(=>
  (and
    (subrelation ?REL1 ?REL2)
    (instance ?REL1 Predicate)
    (instance ?REL2 Predicate)
    (?REL1 @ROW))
  (?REL2 @ROW))


In previous first-order translations such assertions are instantiated with all
R and R′ where (subrelation R R′) is asserted. One of the 35 problems from [21]
(TQG22) makes use of the SUMO assertion that son is a subrelation of parent and 
the macro expansion style of first-order translation is sufficient to handle this 
example. However, the macro expansion approach is insufficient to handle hypo- 
thetical subrelation assertions. The following is an example of a query creating 
a hypothetical subrelation assertion: 


5 Merge.kif is the main SUMO ontology file. While Merge.kif evolves over time,
we work with a fixed version of the file from January 2023. Latest versions of it
and all the other files that make up SUMO are available at
https://github.com/ontologyportal/sumo.
Example 2 (TQG22alt4).


(query (=> (exists (?X) (employs ?X ?X))
           (not (subrelation employs uses))))


During the process of answering this query we will assume employs is a subrela- 
tion of uses and then must instantiate the general assertion about subrelations 
with employs and uses. Our translation to set theory will permit this. 


2.4 Kappa Binders 


One of the 35 queries from [21] (TQG27) has the following local assumption
making use of a κ-binder.


Example 3. The example TQG27 includes three assertions:

(A1) instance Planet Class, (A2) subclass Planet AstronomicalBody, and (the one
with a κ-binder) (A3) instance o (κp.instance p Planet ∧ attribute p Earthlike).
Informally, one can read (A3) as o ∈ {p | p is an Earthlike planet}. The query is
(Q) instance o Planet.


The query should easily follow by eliminating the κ-abstraction. The first-order
problem generated in [21] drops the assumption with the κ-abstraction (A3),
making the problem unlikely to be provable (at least not for the intended reason).
Our translation to set theory will handle κ-binders and the translation of this
problem will be provable in the set theory.


2.5 Real Arithmetic 


Six of the 35 examples from [21] involve some real arithmetic. Two simple exam- 
ple queries are the following: 


Example 4 (TQG3). 


(instance Number3-1 NonnegativeRealNumber)
(query (not (instance Number3-1 NegativeRealNumber)))


Example 5 (TQG11). (query (equal 12 (MultiplicationFn 3 4))) 


For the sake of brevity we represent the first problem as having one local
constant n, one local assumption instance n Nonneg and the query (conjecture)
¬(instance n Neg). We will translate signed rationals with a finite decimal
expansion to real numbers represented as sets. We will also translate Real to be equal
to the set of reals R. Furthermore we translate the operations +, −, *, /, ≤ and <
to have the appropriate meaning when applied to two reals.⁷ We then translate


6 We use a fixed construction of the reals, but the details of this are not relevant here.

7 To be more precise, we are using a specific set of reals constructed in the higher-
order set theory, and operations (e.g., multiplication) are the expected set-theoretic 
operations on that set of reals. For simplicity, our set-theoretic division is a total 
function returning 0 when the denominator is 0. 
Neg to {x ∈ R | x < 0} and Nonneg to {x ∈ R | 0 ≤ x}. Using the properties of the
set-theoretic encoding, the translated queries above are set-theoretic theorems.

In addition to direct uses of arithmetic as in the examples above, arithmetic
is also often used to check type guard information. This is due to the fact that
a spine like t₁ t₂ p will use subtraction to determine that under some constraints
the i-th element of the corresponding list will be the (i − 2)-nd element of the list
interpreting p.


3 Translation of SUMO-K to Set Theory 


3.1 High Level Overview: Sets, Terms, Spines and Formulas 


Our translation maps terms t to sets. The particular set theory we use is
higher-order Tarski-Grothendieck as described in [4]. The details of this set theory
are not important here. We only note that we have ∈, ⊆ (which will be used
to interpret SUMO's instance and subclass) and that we have the ability to
λ-abstract variables to form terms at higher types. The main types of interest
are ι (the base type of sets), o (the type of propositions), ι → ι (the type of
functions from sets to sets) and ι → o (the type of predicates over sets).


Terms: When we say SUMO terms t are translated to sets, we mean they are
translated to terms of type ι in the higher-order set theory.


Spines: Spines s are essentially lists of sets (of varying length). We translate
them as functions that encode finite sequences. These functions are formally of
the general type ι → ι. However, we only use them when restricted to natural
numbers, i.e., arguments n ∈ ω (where ω is the set of finite ordinals). We also
maintain the invariant that the function returns the empty set on all but finitely
many n ∈ ω. An auxiliary function listset : (ι → ι) → ι gives a set-theoretic
representation of the list by restricting its domain to ω.⁹


Tagging, Untagging, Length: To avoid confusion with the empty set being
on a list, we tag elements of lists to ensure they are nonempty. Let I : ι → ι
be such a tagging function (injective on the universe of sets) and U : ι → ι
be an untagging function. We then define nil : ι → ι to be constantly ∅ and
cons : ι → (ι → ι) → ι → ι to take a set x and a list l to the function mapping
0 to I x and i + 1 to l i for i ∈ ω. We also define a function len : (ι → ι) → ι
by λl.{i ∈ ω | l i ≠ ∅}¹⁰ giving us the length of the list (assuming it is a list).
Informally, a spine t₀ ⋯ tₙ₋₁ is thus a function taking i to I(t′ᵢ) for each i ∈
{0, …, n − 1} where t′ᵢ is the set-theoretic value of tᵢ and I the tagging function.
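
The list encoding can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: sets are modelled by hashable Python values, the tagging function is taken to be the singleton map (the text does not fix a particular tagging function), and the length is computed by scanning for the first empty entry, which agrees with the set-theoretic definition only for functions that really encode lists.

```python
EMPTY = frozenset()  # models the empty set


def tag(x):
    """Assumed tagging function: the singleton {x} is injective and never empty."""
    return frozenset([x])


def untag(y):
    """Inverse of tag on tagged values."""
    (x,) = y
    return x


def nil(i):
    """The empty list: constantly the empty set."""
    return EMPTY


def cons(x, l):
    """Map 0 to the tagged head and i + 1 to l(i)."""
    return lambda i: tag(x) if i == 0 else l(i - 1)


def length(l):
    """Number of leading nonempty entries (valid when l encodes a list)."""
    n = 0
    while l(n) != EMPTY:
        n += 1
    return n


# A two-element list whose second element is the empty set itself.
lst = cons("Word", cons(EMPTY, nil))
print(length(lst), untag(lst(1)) == EMPTY)
```

The example list shows why tagging is needed: without it, an entry holding the empty set could not be told apart from the end of the list.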


8 Tarski-Grothendieck is a set theory in which there are universes modeling ZFC set 
theory. These set-theoretic universes should not be confused with the universe of 
discourse Univ1 introduced below. 

9 We include listset since sometimes a list needs to be considered as a set.

10 Note that by design this set is the finite ordinal giving the length of the list. 
Formulas: The translation of a SUMO formula ψ can be thought of either
as a set (which should be one of the sets 0 or 1) or as a proposition. We also
sometimes coerce between type ι and o by considering the sets 0 and 1 to be sets
corresponding to false and true. Let P : ι → o be λX.∅ ∈ X and let B : o → ι be
λp.if p then 1 else 0. We use these functions as coercions between ι and o.
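
As a small illustration of the two coercions, the following sketch models the ordinals 0 and 1 as Python frozensets and assumes that P tests whether the empty set belongs to its argument (which holds for 1 and fails for 0). The Python names are illustrative only.

```python
ZERO = frozenset()        # the ordinal 0, i.e. the empty set
ONE = frozenset([ZERO])   # the ordinal 1, i.e. {0}


def P(X):
    """Set to proposition: true exactly when the empty set is a member of X."""
    return ZERO in X


def B(p):
    """Proposition to set: 1 for true, 0 for false."""
    return ONE if p else ZERO


print(P(B(True)), P(B(False)))
```

On the two sets 0 and 1 the coercions are mutually inverse, which is all the translation relies on.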


3.2 Motivating Examples 


Before describing the translation in more detail, we give a few more simple 
examples to explain various aspects of the translation and motivate our choices. 


Univ1 and Kappa: Let Univ1 be a set. This set is intended to be a universe
of discourse in which most (but not all) targets of interpretation for t
will live. Specifically, we will map the SUMO-type Class to the set ℘ Univ1
(the power set of the universe). We take all SUMO-types except the four special
cases Class, SetOrClass, Abstract and Entity to be sets in ℘ Univ1.
Consequently, if a SUMO object is an instance of some class other than Class,
SetOrClass, Abstract and Entity, we will know that the object is a member of
Univ1. Due to this we choose to translate κ-binders using simple separation
bounded by Univ1. Reconsidering TQG27 discussed in Sect. 2.4 we translate
instance o (κp.instance p Planet ∧ attribute p Earthlike) to a set-theoretic
proposition¹¹ of the form o ∈ {p ∈ Univ1 | ⋯ p ∈ PLANET ∧ ⋯} (only partially
specified at the moment).¹² From this set-theoretic proposition we can easily
derive o ∈ PLANET to solve the set-theoretic version of TQG27.
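
The step from the κ-abstraction to the query can be pictured on a finite toy model. The sketch below is an illustration only: the universe, its element names and the modelling of "attribute p Earthlike" as membership in a set are all invented assumptions, and type guards are omitted.

```python
# Toy finite universe; all element names are invented for illustration.
UNIV1 = frozenset({"o", "a", "b", "c"})
PLANET = frozenset({"o", "a"})      # stands in for the translated class Planet
EARTHLIKE = frozenset({"o", "c"})   # elements assumed to have attribute Earthlike


def kappa(predicate, universe=UNIV1):
    """Separation bounded by the universe: {x in universe | predicate(x)}."""
    return frozenset(x for x in universe if predicate(x))


earthlike_planets = kappa(lambda p: p in PLANET and p in EARTHLIKE)

# Eliminating the abstraction: every member satisfies the defining condition,
# so membership of o in the separated set yields o in PLANET.
print("o" in earthlike_planets, earthlike_planets <= PLANET)
```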


Variable Arity and Type Guards: As mentioned above, partition is a variable
arity relation of at least arity 2 where every argument must be of SUMO-type
Class. We will translate partition to a set PA containing multiple pieces
of information. The behavior of PA as a relation is captured by the results one
obtains by applying it to a set encoding a list of sets (via a set-theoretic operation
ap : ι → ι → ι). We can apply an abstract function arity : ι → ι to obtain the
minimum arity of PA. We can apply an abstract predicate vararity : ι → o to encode
that PA has variable arity. Likewise we can apply an abstract domseq : ι → ι → ι
to PA and an i ∈ ω to recover the intended domain of argument i of PA. These
extra pieces of information are important to determine type guards in the
presence of function and relation arguments.

In the specific case of partition the translation yields a set PA such that
arity PA = 2, vararity PA is true and for i ∈ {0, 1, 2}, domseq PA i = ℘ Univ1. The
value of domseq PA 2 determines the intended domain of all remaining (optional)
arguments of the relation. (Note that SUMO indexes the first argument by 1
while in the set theory the first argument is indexed by 0.) The SUMO assertion


(partition Word Noun Verb Adjective Adverb ParticleWord) 


11 A set-theoretic proposition is a closed formula in the language of higher-order set 
theory [4]. 

12 Note that the SUMO constant is Planet while its translated set-theoretic counterpart 
is PLANET. 
translates to the set-theoretic statement¹³


P (ap PA (listset (cons Word (cons Noun (cons Verb (cons Adjective 
(cons Adverb (cons ParticleWord nil)))))))). 


Recall the SUMO-K assertion


∀p.partition p → exhaustiveDecomposition p ∧ disjointDecomposition p.


In this case the translation also generates type guards for the row variable p. Let 
PA, ED and DD be the sets corresponding to the SUMO constants partition, 
exhaustiveDecomposition and disjointDecomposition. Essentially, the assertion 
should only apply to p when p has at least length 2 and every entry is a (tagged) 
class. The translated set-theoretic statement (with type guards) is 


∀p : ι → ι.dom_of (vararity PA) (arity PA) (domseq PA) p
 → dom_of (vararity ED) (arity ED) (domseq ED) p
 → dom_of (vararity DD) (arity DD) (domseq DD) p
 → P (ap PA p) → P (ap ED p) ∧ P (ap DD p)


The statement above makes use of a new definition: dom_of : o → ι → (ι → ι) →
(ι → ι) → o. The first argument of dom_of is a proposition encoding whether
or not the function or relation is variable arity. In this case, all three of the
relations are variable arity (with the same typing information for all three).
In the variable arity case dom_of ⊤ n D p is defined to be dom_of_varar n D p
where dom_of_varar : ι → (ι → ι) → (ι → ι) → o, n is the minimum arity, D
is the list of domain information and p is the list we are requiring to satisfy the
guard. dom_of_varar n D p is defined to hold if the following three conditions
hold:


1. n ⊆ len p (p has at least length n),
2. ∀i ∈ n, U (p i) ∈ D i, and
3. ∀i ∈ len p, n ⊆ i → U (p i) ∈ D n.
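
The three conditions can be read as a simple check on a list of arguments. The following sketch is a simplification, not the set-theoretic definition: it uses plain Python lists (so no tagging or untagging is needed), models D as a function from an index to a Python set, and reads n ⊆ i on finite ordinals as n <= i. The example domains are invented.

```python
def dom_of_varar(n, D, p):
    """Guard for a variable arity relation.

    n: minimum arity; D: index -> domain (a Python set); p: list of arguments.
    """
    if len(p) < n:                                        # condition 1
        return False
    if any(p[i] not in D(i) for i in range(n)):           # condition 2
        return False
    return all(p[i] in D(n) for i in range(n, len(p)))    # condition 3


# Illustrative domains: position 0, position 1, and all optional positions.
A, B, C = {"a"}, {"b"}, {"c"}
D = lambda i: [A, B, C][min(i, 2)]

print(dom_of_varar(2, D, ["a", "b", "c", "c"]))  # optional arguments lie in D(2)
print(dom_of_varar(2, D, ["a"]))                 # too short
```

Only D(0), ..., D(n) are ever consulted, which matches the remark that the entry at the minimum arity fixes the domain of all optional arguments.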


For fixed arity, dom_of is defined via a simpler dom_of_fixedar condition.
Another SUMO assertion about partitions is 


(=> 
(partition ?SUPER ?SUB1 ?SUB2) 
(partition ?SUPER ?SUB2 ?SUB1)) 


In this case there are three standard (nonrow) variables needing type guards
in the translation. Roughly speaking, domseq PA has the information we need,
but in general we must modify it to be appropriate for variable arity relations.
For this reason domseqm : ι → ι → ι is defined to be


λr i.if vararity r then domseq r (if i ∈ arity r then i else arity r) else domseq r i.
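
The effect of domseqm is to clamp the argument index at the minimum arity for variable arity relations. A minimal sketch follows, with lookup tables standing in for the abstract functions arity, vararity and domseq; the relation names and domain labels are invented, and i ∈ arity r is read as i < arity r on finite ordinals.

```python
# Hypothetical lookup tables standing in for arity, vararity and domseq.
ARITY = {"var_rel": 2, "fixed_rel": 2}
VARARITY = {"var_rel": True, "fixed_rel": False}
DOMSEQ = {
    ("var_rel", 0): "D0", ("var_rel", 1): "D1", ("var_rel", 2): "D2",
    ("fixed_rel", 0): "E0", ("fixed_rel", 1): "E1",
}


def domseq(r, i):
    """Domain of argument i of r (None when nothing is recorded)."""
    return DOMSEQ.get((r, i))


def domseqm(r, i):
    """For variable arity relations, clamp the index at the minimum arity."""
    if VARARITY[r]:
        return domseq(r, i if i < ARITY[r] else ARITY[r])
    return domseq(r, i)


print(domseqm("var_rel", 1), domseqm("var_rel", 7))
```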


13 Note that we omit parentheses via the usual convention that implication is right
associative, i.e., φ → ψ → ξ means φ → (ψ → ξ). Note also this is logically equivalent
to φ ∧ ψ → ξ.
The translated statement is


∀X Y Z.X ∈ domseqm PA 0 → Y ∈ domseqm PA 1 → Z ∈ domseqm PA 2
 → Z ∈ domseqm PA 1 → Y ∈ domseqm PA 2
 → P (ap PA (cons X (cons Y (cons Z nil))))
 → P (ap PA (cons X (cons Z (cons Y nil)))).


A simpler translation for handling type guards in this example could avoid 
the use of dom_of and domseqm and instead look up the arity and typing
information for partition, etc. This translation would not work in general since SUMO
assertions quantify over relations, in which case the particular type guards are 
not known until the relation variables are instantiated. Consider the SUMO-K 
formula 


∀R₁ R₂.∀p.subrelation R₁ R₂ ∧ instance R₁ Predicate → instance R₂ Predicate
 → R₁ p → R₂ p.


This translates to the set-theoretic proposition 


∀R₁ R₂ : ι.∀p : ι → ι.R₁ ∈ domseqm SR 0 → R₂ ∈ domseqm SR 1 →
 R₁ ∈ E → R₂ ∈ E → dom_of (vararity R₁) (arity R₁) p
 → dom_of (vararity R₂) (arity R₂) p
 → P (ap SR (cons R₁ (cons R₂ nil))) ∧ R₁ ∈ PR → R₂ ∈ PR → P (ap R₁ p)
 → P (ap R₂ p)


where E, SR and PR are the sets corresponding to the SUMO constants Entity,
subrelation and Predicate. Here the type guards on p depend on R₁ and R₂. Two
special cases are the type guards Rᵢ ∈ E which are derived from the use of Rᵢ as
the first argument of instance.


3.3 The Translation 


We now describe the translation itself. A first pass through the SUMO files given 
records the typing information from domain, range, domainsubclass, rangesubclass 
and subrelation assertions. A finite number of secondary passes determines which 
names will have variable arity (either due to a direct assertion or due to being 
inferred to be in a variable arity class). 

The final pass translates the assertions, and this is our focus here. Each
SUMO-K assertion is a SUMO-K formula ψ which may have free variables in it.
Thus if we translate the SUMO-K formula ψ into the set-theoretic proposition
ψ′, then the translated assertion will be


∀x₁ ⋯ xₙ.G₁ → ⋯ → Gₘ → ψ′


where x₁, …, xₙ are the free variables in ψ and G₁, …, Gₘ are the type guards
for these free variables. Note that some of these free variables may be for spine


14 In practice with the current Merge.kif file, a single secondary pass suffices, but in
general one might need an extra pass to climb the class hierarchy. 
variables (i.e., row variables) and may have type ι → ι. Such variables may also
have type guards.

SUMO-K variables x translate to themselves where after translation x is a
variable of type ι (ranging over sets). For SUMO-K constants c we choose a
name c′ and declare this as having type ι. Rational numbers q with a finite
decimal expansion are translated to the set calculating the quotient of the base
ten numerator divided by the appropriate power of 10. For example, 11.2 would
be translated to the term 1 * 10² + 1 * 10 + 2 divided by 10 (where 1, 2 and
10 are the usual finite ordinals and exponentiation by finite ordinals is defined
by recursion). When a variable or constant is applied to a spine we translate
the spine and use ap. As mentioned in Sect. 2.5 Real is translated to the set R,
Neg is translated to {x ∈ R | x < 0} and Nonneg is translated to {x ∈ R | 0 ≤ x}.
The other arithmetical constructs are translated to sets, but we assume special
properties such as


∀x y ∈ R.ap ADD (cons x (cons y nil)) = x + y,


∀x y ∈ R.ap MULT (cons x (cons y nil)) = x · y


and


∀x y ∈ R.P (ap LESSTHAN (cons x (cons y nil))) = (x < y).


- (x s) translates to (ap x (listset s′)) where s′ is the result of translating the
  SUMO-K spine s.

- (c s) translates to (ap c′ (listset s′)) where s′ is the result of translating the
  SUMO-K spine s and c′ is the chosen set as a counterpart to the SUMO-K
  constant c. Arithmetical operations are handled the same way.


The only remaining case for terms is κ binder terms.


- We translate (κx.ψ) to
  {x ∈ Univ1 | G₁ ∧ ⋯ ∧ Gₘ ∧ ψ′}
  where G₁, …, Gₘ are generated type guards for x and ψ′ is the result of
  translating the SUMO-K formula ψ to a set-theoretic proposition. Note that
  x ranges over Univ1.
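
The treatment of finite decimals among the term cases above (11.2 becoming 112 divided by 10) can be sketched as follows. This is an illustration using ordinary Python integers rather than finite ordinals and reals as sets; the function name is hypothetical.

```python
from fractions import Fraction


def decimal_to_quotient(text):
    """Split a signed finite decimal into (numerator, power-of-ten denominator)."""
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-")
    whole, _, frac = digits.partition(".")
    numerator = 0
    for d in whole + frac:
        # Horner form of d_k * 10^k + ... + d_1 * 10 + d_0
        numerator = numerator * 10 + int(d)
    return sign * numerator, 10 ** len(frac)


num, den = decimal_to_quotient("11.2")
print(num, den, Fraction(num, den))
```

The numerator is exactly the base ten expansion 1 * 10² + 1 * 10 + 2, and the denominator is 10 raised to the number of digits after the decimal point.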


The translation of spines is relatively straightforward, but a few points are
worth mentioning.


- The SUMO-K spine (t s) is translated to the list one gets by applying cons
  to t′ and s′ where t′ is the translation of t and s′ is the translation of s.

- A spine variable p is translated to itself (a variable of type ι → ι).

- In the case p t₁ … tₙ we translate p to itself (a variable of type ι → ι) and
  translate each tᵢ to a set t′ᵢ and return the function that returns p j given
  j < len p and returns t′ᵢ given len p + i (appending the two lists).

- The empty spine is translated to nil.
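
Appending extra terms after a row variable can be sketched with the same functional view of lists. The sketch below makes several assumptions that the text does not fix: tagging is the singleton map, the appended terms are tagged like all other list entries, and they are placed at 0-based offsets starting at len p.

```python
EMPTY = frozenset()  # models the empty set


def tag(x):
    """Assumed tagging function (singleton)."""
    return frozenset([x])


def from_items(items):
    """Encode a Python list as a function from indices to tagged entries."""
    tagged = [tag(x) for x in items]
    return lambda i: tagged[i] if 0 <= i < len(tagged) else EMPTY


def length(l):
    """Number of leading nonempty entries (valid when l encodes a list)."""
    n = 0
    while l(n) != EMPTY:
        n += 1
    return n


def append_terms(rho, terms):
    """Spine 'rho t1 ... tn': entries of rho first, then the extra terms."""
    n = length(rho)
    extra = from_items(terms)
    return lambda j: rho(j) if j < n else extra(j - n)


rho = from_items(["a", "b"])
spine = append_terms(rho, ["c"])
print(length(spine), spine(2) == tag("c"))
```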
We consider each case of a SUMO-K formula. The usual logical operators are
translated as the corresponding operators:


- ⊥ and ⊤ translate simply to ⊥ and ⊤.

- (¬ψ) translates to ¬ψ′ where ψ is a SUMO-K formula which translates to
  the set-theoretic proposition ψ′.

- (ψ → ξ) translates to ψ′ → ξ′ where ψ and ξ are SUMO-K formulas which
  translate to the set-theoretic propositions ψ′ and ξ′.

- (ψ ↔ ξ) translates to ψ′ ↔ ξ′ where ψ and ξ are SUMO-K formulas which
  translate to the set-theoretic propositions ψ′ and ξ′.

- Theoretically, ψ ∧ ξ translates to ψ′ ∧ ξ′. Practically speaking, in SUMO-K
  conjunction is n-ary so it is more accurate to state that (and ψ₁ ⋯ ψₙ)
  translates to ψ′₁ ∧ ⋯ ∧ ψ′ₙ where ψ₁, …, ψₙ are SUMO-K formulas which
  translate to the set-theoretic propositions ψ′₁, …, ψ′ₙ.

- Again, theoretically ψ ∨ ξ translates to ψ′ ∨ ξ′. Practically, (or ψ₁ ⋯ ψₙ)
  translates to ψ′₁ ∨ ⋯ ∨ ψ′ₙ where ψ₁, …, ψₙ are SUMO-K formulas which
  translate to the set-theoretic propositions ψ′₁, …, ψ′ₙ.

- Theoretically, ∀x.ψ translates to ∀x.G₁ → ⋯ → Gₘ → ψ′ where ψ′ is the
  result of translating ψ and G₁, …, Gₘ are the generated type guards for
  x. Practically speaking, SUMO-K allows several variables to be universally
  quantified at once, so it is more accurate to say (forall (x₁ … xₙ) ψ)
  translates to ∀x₁ … xₙ.G₁ → ⋯ → Gₘ → ψ′ where x₁, …, xₙ are variables,
  G₁, …, Gₘ are the generated type guards for these variables and ψ′ is the
  set-theoretic proposition obtained by translating ψ. That is, each Gᵢ is a type
  guard induced by one of the variables xⱼ, with all the guards computed for
  the n variables simultaneously. While we could combine the guards into a
  single conjunction, we do not.

- ∀p.ψ is translated similarly, but with type guards for the row variable p.

- Again, theoretically ∃x.ψ translates to ∃x.G₁ ∧ ⋯ ∧ Gₘ ∧ ψ′, where ψ′ is
  the set-theoretic proposition obtained by translating the SUMO-K formula
  ψ, but generalized to handle quantifying multiple variables.

- ∃p.ψ is translated similarly, but with type guards for the row variable p.

- (t₁ = t₂) translates to t′₁ = t′₂ where t₁ and t₂ are SUMO terms which
  translate to sets t′₁ and t′₂.
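
The shape of these cases can be summarised by a small recursive function. It is only a sketch of the pattern, not the authors' implementation: formulas are nested Python tuples, atoms are left as opaque strings, the output is a string, and the guards_for callback that supplies type guards is a hypothetical stand-in for the guard inference described earlier.

```python
def translate(formula, guards_for=lambda variables: []):
    """Translate a SUMO-K formula given as nested tuples into a string."""
    if isinstance(formula, str):
        return formula  # atoms are left untouched in this sketch
    op, *args = formula
    if op == "not":
        return f"¬{translate(args[0], guards_for)}"
    if op in ("=>", "<=>"):
        a, b = (translate(x, guards_for) for x in args)
        arrow = "→" if op == "=>" else "↔"
        return f"({a} {arrow} {b})"
    if op in ("and", "or"):  # n-ary in SUMO-K
        sym = " ∧ " if op == "and" else " ∨ "
        return "(" + sym.join(translate(x, guards_for) for x in args) + ")"
    if op in ("forall", "exists"):
        variables, body = args
        parts = list(guards_for(variables)) + [translate(body, guards_for)]
        names = " ".join(variables)
        if op == "forall":  # guards become a chain of implications
            return f"∀{names}.{' → '.join(parts)}"
        return f"∃{names}.{' ∧ '.join(parts)}"  # guards are conjoined
    raise ValueError(f"unsupported operator: {op}")


guards = lambda variables: [f"G_{v}" for v in variables]
print(translate(("forall", ("x", "y"), ("=>", "A", "B")), guards))
```

Universal quantifiers receive their guards as separate implications rather than one conjunction, mirroring the choice stated above, while existential quantifiers conjoin them with the body.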


We use set membership and inclusion to interpret instance and subclass.


- (instance t₁ t₂) translates to t′₁ ∈ t′₂ where t₁ and t₂ are SUMO terms which
  translate to sets t′₁ and t′₂.

- (subclass t₁ t₂) translates to t′₁ ⊆ t′₂ where t₁ and t₂ are SUMO terms which
  translate to sets t′₁ and t′₂.


4 Interactive Proofs of Translated SUMO Queries


The motivating set of examples were the 35 example queries from [21], now
expanded¹⁵. Six of the original examples involve temporal reasoning. We omit


15 https: //github.com /ontologyportal/sumo/tree/master /tests. 
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these for the moment, leaving a future translation to handle temporal and modal 
reasoning. 9 questions involve too many arguments for the existing first-order 
translation with macro expansion to work, but which are handled by our new 
translation. Among the remaining problems, 5 require some arithmetical reason- 
ing, which use preexisting translations to standard first-order logic (FOF) and 
to an extension of first-order logic with arithmetic (TFF). For the remaining 
problems, the results of (at least) 5 were still not provable by the ATPs Vampire 
or E within a 600s timeout. 

We carefully looked at the set-theoretic translation of 13 of the problems 
that were too difficult for first-order provers (for any of the above reasons other 
than the use of temporal or modal reasoning). We either did an interactive 
proof or found slight modifications of the problem that could be interactively 
proven. The interactive proofs were done in Megalodon (the successor to the 
Egal system [4]). One advantage of having such a translation is the ability to 
attempt interactive proofs and recognize what may be missing from Merge.kif 
or the original query. We also did interactive proofs of 4 problems that the first- 
order provers could prove. We additionally included the 6 problems dealing with 
variable arity and row variables (e.g., Example 1). In total we have 23 SUMO- 
K queries translated to set-theoretic statements that have been interactively 
proven. We briefly describe some of the interactive proofs here. 

An example with a particularly simple proof is TQG27 (Example 3), the
example with a κ-binder. The assertion with the κ-binder translates to the
set-theoretic proposition


o ∈ {p ∈ Univ1 | p ∈ E ∧ p ∈ domseqm attribute 0 ∧ p ∈ Planet
∧ P (ap attribute (listset (cons p (cons Earthlike nil))))}.


The query translates simply to o ∈ Planet.

When interactively proving the translated query in Megalodon, we are free
to use statements coming from three sources: set-theoretic propositions already
proven in Megalodon (or that are axioms of Tarski-Grothendieck set theory),
propositions resulting from the translation of formulas in Merge.kif, and propositions
resulting from translating formulas local to the example. In this case we only
need two propositions: the translated formula local to the example given above
and one known set-theoretic proposition of the form:


∀X : ι. ∀P : ι → o. ∀x : ι. x ∈ {x ∈ X | P x} ↔ x ∈ X ∧ P x.


From the two propositions we easily obtain the conjunction


o ∈ Univ1 ∧ o ∈ E ∧ o ∈ domseqm attribute 0 ∧ o ∈ Planet
∧ P (ap attribute (listset (cons o (cons Earthlike nil)))).


After this first step, a series of steps eliminate the conjunctions until we have
the desired conjunct o ∈ Planet.
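
The shape of this argument can be replayed abstractly in Lean 4. This is only a sketch, not the Megalodon proof: `mem` and `sep` are assumed stand-ins for set membership and separation, `sep_iff` plays the role of the known set-theoretic proposition, and the body of the κ-binder is shortened to two conjuncts (`Rest` abbreviates the remaining ones).

```lean
-- Abstract stand-ins (assumptions): `mem` for ∈, `sep X Q` for {x ∈ X | Q x}.
example {ι : Type} (mem : ι → ι → Prop) (sep : ι → (ι → Prop) → ι)
    (sep_iff : ∀ (X : ι) (Q : ι → Prop) (x : ι),
      mem x (sep X Q) ↔ mem x X ∧ Q x)
    (Univ1 Planet o : ι) (Rest : ι → Prop)
    (h : mem o (sep Univ1 (fun p => mem p Planet ∧ Rest p))) :
    mem o Planet :=
  -- Unfold the separation, then project out the desired conjunct.
  ((sep_iff Univ1 _ o).mp h).2.1
```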

Another relatively simple example is TQG11 (Example 5) in which we must
essentially prove 12 is 3·4. To be more precise we must prove


1·10 + 2 = ap MULT (listset (cons 3 (cons 4 nil))).


As mentioned in Sect. 3.3 the translation adds the proposition


∀x y ∈ ℝ. ap MULT (cons x (cons y nil)) = x·y


which will be useful here. In the interactive proof, we first prove a claim that
every natural number (finite ordinal) is a real number (i.e., ω ⊆ ℝ, which is
true for the representation of the reals being used). This claim is then used to
prove 3 ∈ ℝ and 4 ∈ ℝ. This allows us to reduce the main goal to proving
1·10 + 2 = 3·4. This goal is then proven by an unsurprising sequence of rewrites
using equations defining the behavior of + and · on finite ordinals. (Many details
are elided here, such as the fact that there are actually two different operations
+, one on reals and one only on finite ordinals and that they provably agree on
finite ordinals.)
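
For comparison, the final arithmetic goal is a one-line check in Lean 4. This is only an analogy: Lean's built-in natural numbers compute directly, whereas the Megalodon proof rewrites with the defining equations of + and · on finite ordinals.

```lean
-- The decimal numeral 12 is 1·10 + 2; both sides evaluate to the same number.
example : 1 * 10 + 2 = 3 * 4 := rfl
```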

We next consider the proof of the translation of Example 2. The set-theoretic 
proposition resulting from translating the query is 


(∃x. x ∈ domseqm employs 0 ∧ x ∈ domseqm employs 1
∧ P (ap employs (listset (cons x (cons x nil)))))
→ ¬P (SR (listset (cons employs (cons uses nil)))).


We begin the interactive proof by proving the following sequence of claims: 


1. len nil = 0.
2. ∀X. ∀R : ι → ι. ∀n. nat_p n → len R = n → len (cons X R) = ordsucc n.
3. ∀y. ¬vararity y → ∀i. domseqm y i = domseq y i.
4. ∀y. ¬vararity y → ∀x i. x ∈ domseq y i → x ∈ domseqm y i.
5. ∀X. ∀R : ι → ι. cons X R 0 = I X.
6. ∀n. nat_p n → ∀X. ∀R : ι → ι. cons X R (ordsucc n) = R n.
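
The claims about cons and len can be mirrored in a small executable model. This is a sketch under assumptions, not the Megalodon definitions: a row is modelled as a Python function from positions to entries, `EMPTY` is an assumed marker for positions past the end, and `I`/`U` are modelled as tagging and untagging an element, which is only a guess at their role that is consistent with the equations above. The name `length` is used instead of `len` to avoid shadowing the Python builtin.

```python
EMPTY = None  # assumed marker for positions past the end of a row


def I(x):
    # Tag an element so that a stored entry differs from EMPTY.
    return ("item", x)


def U(tagged):
    # Inverse of I: recover the element from a tagged entry.
    return tagged[1]


def nil(n):
    # The empty row has no entry at any position.
    return EMPTY


def cons(x, row):
    # cons x row 0 = I x, and cons x row (n + 1) = row n.
    return lambda n: I(x) if n == 0 else row(n - 1)


def length(row):
    # Count entries up to the first empty position.
    n = 0
    while row(n) is not EMPTY:
        n += 1
    return n


ROW = cons("x", cons("x", nil))
print(ROW(0) == I("x"), U(ROW(1)), length(ROW))
```

With these definitions the facts about a two-element row (its two entries and its length 2) follow by direct evaluation, which is what the interactive proof establishes by rewriting.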


We can then rewrite domseqm employs into domseq employs. Starting the main
body of the proof, we assume we have an x such that x ∈ domseq employs 0,
x ∈ domseq employs 1 and P (ap employs (listset (cons x (cons x nil)))). We further
assume P (SR (listset (cons employs (cons uses nil)))) and prove a contradiction.
Using the translated Merge.kif type information from employs we can infer x is
an autonomous agent and an object. Likewise we can infer employs is a predicate
and a relation, and the same for uses. The contradiction follows from two claims:
P (ap uses (cons x (cons x nil))) and ¬P (ap uses (cons x (cons x nil))).

We first prove P (ap uses (cons x (cons x nil))). We locally let ROW be
cons x (cons x nil) and use the claims above to prove ROW 0 = I x, ROW 1 =
I x, U (ROW 0) = x, U (ROW 1) = x and len ROW = 2. We can then essentially
complete the subproof using the local assumptions


P (ap employs (listset (cons x (cons x nil)))) 


and 
P (SR (listset (cons employs (cons uses nil)))) 


along with the translation of the following Merge.kif formula: 
(=>
  (and
    (subrelation ?REL1 ?REL2)
    (instance ?REL1 Predicate)
    (instance ?REL2 Predicate)
    (?REL1 @ROW))
  (?REL2 @ROW))


To complete the contradiction we prove ¬P (ap uses (cons x (cons x nil))).
The three most significant Merge.kif formulas whose translated propositions are 
used in the subproof are: 


(instance uses AsymmetricRelation)
(subclass AsymmetricRelation IrreflexiveRelation)
(=>
  (instance ?REL IrreflexiveRelation)
  (forall (?INST)
    (not
      (?REL ?INST ?INST))))


That is, Merge.kif declares that uses is an asymmetric relation, every asym- 
metric relation is an irreflexive relation, and that irreflexive relations have the 
expected property of irreflexivity. 
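
Stripped of the set-theoretic encoding, the logical core of this proof fits in a few lines of Lean 4. This is a sketch under simplifying assumptions: `employs` and `uses` are modelled as ordinary binary relations on an arbitrary type, the subrelation statement as a plain implication, and irreflexivity of `uses` is taken as a hypothesis rather than derived from the Merge.kif class hierarchy.

```lean
-- If some x employs itself and `uses` is irreflexive,
-- then `employs` cannot be a subrelation of `uses`.
theorem employs_not_subrelation {α : Type} (employs uses : α → α → Prop)
    (irrefl : ∀ a, ¬ uses a a)
    (h : ∃ x, employs x x) :
    ¬ (∀ a b, employs a b → uses a b) :=
  fun sub => h.elim (fun x hx => irrefl x (sub x x hx))
```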


5 ATP Problem Set 


After interactively proving the 23 problems, we created TH0^16 problems
restricted to the axioms used in the proof. This removes the need for the higher-
order ATP to do premise selection. Additionally we used Megalodon to analyze
the interactive proof to create a number of subgoal problems for ATPs, ranging
from the full problem (the initial goal to be proven) to the smallest subgoals
(completed by a single tactic). For example, the interactive proofs of Examples 1,
2 and 5 generate 415, 322 and 100 TH0 problems, respectively. In total, analysis
of the interactive proofs yields 4880 (premise-minimized) TH0 problems for
ATPs. In Table 1 we give the results for several higher-order automated theorem
provers (Leo-III [24], Vampire [13], Lash [3], Zipperposition [27], E [22]), given
a 60s timeout.

16 TH0 was introduced as THF in [2] as a core language for representing typed higher-
order formulas (in the sense of Church's simple type theory) for automated theorem
provers.


Table 1. Number of Subgoals Proven Automatically in 60s

| Problem   | Subgoals | Zipperposition | Vampire     | E           | Lash        | Leo-III     |
|-----------|----------|----------------|-------------|-------------|-------------|-------------|
| TQG1      | 50       | 50 (100%)      | 50 (100%)   | 50 (100%)   | 50 (100%)   | 50 (100%)   |
| TQG3      | 20       | 20 (100%)      | 20 (100%)   | 14 (70%)    | 20 (100%)   | 8 (40%)     |
| TQG7      | 195      | 188 (96%)      | 185 (95%)   | 180 (92%)   | 160 (82%)   | 158 (81%)   |
| TQG9      | 19       | 19 (100%)      | 19 (100%)   | 19 (100%)   | 19 (100%)   | 19 (100%)   |
| TQG10     | 112      | 112 (100%)     | 112 (100%)  | 100 (89%)   | 58 (52%)    | 96 (86%)    |
| TQG11     | 100      | 76 (76%)       | 39 (39%)    | 67 (67%)    | 45 (45%)    | 13 (13%)    |
| TQG19     | 37       | 34 (92%)       | 22 (59%)    | 20 (54%)    | 37 (100%)   | 11 (30%)    |
| TQG20     | 41       | 34 (83%)       | 22 (54%)    | 20 (49%)    | 41 (100%)   | 13 (32%)    |
| TQG21     | 207      | 154 (74%)      | 150 (72%)   | 143 (69%)   | 101 (49%)   | 56 (27%)    |
| TQG22alt3 | 319      | 246 (77%)      | 214 (67%)   | 193 (61%)   | 197 (62%)   | 136 (43%)   |
| TQG22alt4 | 322      | 251 (78%)      | 218 (68%)   | 197 (61%)   | 201 (62%)   | 142 (44%)   |
| TQG22     | 315      | 271 (86%)      | 224 (71%)   | 212 (67%)   | 201 (64%)   | 142 (45%)   |
| TQG23     | 67       | 61 (91%)       | 67 (100%)   | 42 (63%)    | 51 (76%)    | 38 (57%)    |
| TQG25alt1 | 910      | 652 (72%)      | 526 (58%)   | 580 (64%)   | 529 (58%)   | 246 (27%)   |
| TQG27     | 7        | 7 (100%)       | 7 (100%)    | 7 (100%)    | 7 (100%)    | 7 (100%)    |
| TQG28alt1 | 600      | 428 (71%)      | 386 (64%)   | 349 (58%)   | 261 (44%)   | 213 (36%)   |
| TQG30     | 4        | 4 (100%)       | 4 (100%)    | 3 (75%)     | 4 (100%)    | 4 (100%)    |
| TQG33     | 112      | 82 (73%)       | 83 (74%)    | 79 (71%)    | 85 (76%)    | 36 (32%)    |
| TQG45     | 162      | 136 (84%)      | 131 (81%)   | 128 (79%)   | 106 (65%)   | 36 (22%)    |
| TQG46     | 344      | 258 (75%)      | 215 (62%)   | 225 (65%)   | 163 (47%)   | 144 (42%)   |
| TQG47     | 186      | 141 (76%)      | 113 (61%)   | 109 (59%)   | 93 (50%)    | 79 (42%)    |
| TQG48     | 336      | 249 (74%)      | 234 (70%)   | 219 (65%)   | 184 (55%)   | 146 (43%)   |
| wordex    | 415      | 315 (76%)      | 255 (61%)   | 236 (57%)   | 284 (68%)   | 143 (34%)   |
| Total     | 4880     | 3788 (78%)     | 3296 (68%)  | 3192 (65%)  | 2897 (59%)  | 1936 (40%)  |


6 Future Work


The primary plan to extend the translation is to include temporal and modal
operators. SUMO includes many modal operators including necessity, possibility,
deontological operators (obligation and permission) and modalities for knowledge,
beliefs and desires. Each modality can be modelled using Kripke style
semantics [14] (possible worlds with an accessibility relation).

The following is an example of a SUMO formula in Merge.kif using modalities^17:


(=>
  (modalAttribute ?FORMULA Necessity)
  (modalAttribute ?FORMULA Possibility))


17 Note that SUMO embeds several different modalities that have different axiomatizations.
Rather than assuming one particular modal logic axiomatization (S4, S5, etc.),
by embedding different modal logics in higher-order logic we hope to determine whether
we can create a coherent system of axiomatizations while avoiding known paradoxes
like the gentle murderer paradox.
The current translation simply skips these formulas as they are not in the SUMO-
K fragment. If we only wanted to extend the translation to include necessity and
possibility, we could change the translation to make the dependence on worlds
explicit. The SUMO formula above could translate to the proposition


∀w ∈ W. ∀φ : ι → ι. (∀v ∈ W. R w v → P (φ v)) → (∃v ∈ W. R w v ∧ P (φ v)).


Here W is a set of worlds and R is an accessibility relation on W. Note that
the translated formula variable has type ι → ι instead of type ι to make the
dependence of the formula on the world explicit. In general, terms, spines and
formulas would depend on a world w and in an asserted formula the world w
would be universally quantified (ranging over W) as above.
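
The world-explicit reading of the formula can be checked on small finite frames. The following is a minimal sketch, not part of the translation: worlds are Python values, the accessibility relation is a set of pairs, and the formula variable is modelled as a Boolean-valued function of the world, standing in for P (φ v). All names are illustrative.

```python
def box(worlds, access, phi, w):
    """phi holds at every world accessible from w (necessity)."""
    return all(phi(v) for v in worlds if (w, v) in access)


def diamond(worlds, access, phi, w):
    """phi holds at some world accessible from w (possibility)."""
    return any(phi(v) for v in worlds if (w, v) in access)


def necessity_implies_possibility(worlds, access, phi):
    """Check the implication 'necessary phi -> possible phi' at every world."""
    return all(
        (not box(worlds, access, phi, w)) or diamond(worlds, access, phi, w)
        for w in worlds
    )


worlds = {0, 1}
serial = {(0, 1), (1, 1)}   # every world has a successor
dead_end = {(0, 1)}         # world 1 has no successor


def always(v):
    return True


print(necessity_implies_possibility(worlds, serial, always))    # True
print(necessity_implies_possibility(worlds, dead_end, always))  # False
```

The second frame shows why the proposition is not valid for an arbitrary R: at a world with no accessible worlds, necessity holds vacuously while possibility fails, so the implication relies on the accessibility relation being serial.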

If we took the approach above to model necessity and possibility, then to add
deontic modalities later we would need a second set of worlds and accessibility
relation. The translation of terms would then have type ι → ι → ι to account
for the dependence on both kinds of worlds. To avoid adding a new dependency
for every modality, our plan is to combine the sets of
worlds and accessibility relations in an extensible way. Thus terms will translate
to have type ι → ι, essentially giving dependence on a single set encoding a
sequence of worlds (where we are open ended about the length of the sequence).
Using this idea, the SUMO formula above would translate to something like


∀w ∈ (Πx ∈ X. W x). ∀φ : ι → ι. (∀v ∈ (Πx ∈ X. W x). R m w v → P (φ v))
→ (∃v ∈ (Πx ∈ X. W x). R m w v ∧ P (φ v))


where X is an index set (where each x ∈ X corresponds to a modality being
interpreted), m ∈ X is the specific index for necessity and possibility, W x is
the set of worlds for x, and R x is a relation between w, v ∈ Πx ∈ X. W x that
holds if the x components satisfy the accessibility relation over W x and the
other components of w and v do not change. This allows us to model an arbitrary
number of modalities using Kripke semantics while only carrying one world
argument. Another advantage is that it minimizes the change to the translation
of formulas in the SUMO-K fragment (without modalities). The only required
change is to add a single dependence on w via a new argument and universally
quantify over w if the formula is asserted.
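
The combined accessibility relation can be sketched concretely. This is an illustration under assumptions, not the planned encoding: a combined world is modelled as a Python dict from modality index to a world of that modality, and the index names, world names and the `access` table are all hypothetical.

```python
def related(index, access, w, v):
    """R index w v: step along one modality, all other components fixed.

    w and v are dicts mapping each modality index to a world; access maps
    each index to a set of (world, world) pairs for that modality.
    """
    if w.keys() != v.keys():
        return False
    step_ok = (w[index], v[index]) in access[index]
    others_fixed = all(w[k] == v[k] for k in w if k != index)
    return step_ok and others_fixed


access = {
    "alethic": {("a0", "a1")},   # necessity / possibility
    "deontic": {("d0", "d1")},   # obligation / permission
}
w = {"alethic": "a0", "deontic": "d0"}

print(related("alethic", access, w, {"alethic": "a1", "deontic": "d0"}))  # True
print(related("alethic", access, w, {"alethic": "a1", "deontic": "d1"}))  # False
```

Adding a further modality only adds a key to the dicts, so formulas still carry a single world argument, which is the extensibility the combined encoding is meant to provide.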

We have already done some experiments with this approach and it shows 
promise. The previous experiments need to be extended to include changes that 
have occurred to obtain the SUMO-K translation described in the present paper. 
Once this is done, we must ensure that translated examples both with modalities 
and the examples in this paper without modalities are provable interactively. 
We also plan to test automated theorem provers on the subgoals obtained from
the interactive proofs. Doing so with the 23 examples in this paper will give an
indication of how much more difficult the translated problems become if the Kripke
infrastructure to handle modalities is included.

Another aspect of SUMO is its modalities involving likelihood and probability.
These cannot be modelled by Kripke semantics (as the modalities are not normal).
We are experimenting with using neighborhood semantics to include these
modalities.
7 Conclusion


We have described a translation from the SUMO-K fragment of SUMO into
higher-order set theory. We have considered a number of examples that use
aspects of SUMO-K that go beyond traditional first-order logic, namely variable
arity functions and relations, row variables, term-level κ-binders and arithmetic.
We have described a number of interactive proofs of translated queries
and tested higher-order automated theorem provers on problems obtained by 
doing premise selection using the corresponding interactive proofs. This gives a 
set of problems for automated theorem provers that come from the area of “com- 
mon sense reasoning,” an area quite different from the more common sources of 
formalized mathematics and program verification. On most of the examples, 
higher-order automated theorem provers cannot fully automatically prove the 
query, but they perform reasonably well on subgoal problems extracted from 
the interactive proofs. This gives an indication that the full problems (assum- 
ing premise selection) are not too far out of reach for current state of the art 
higher-order automated theorem provers. 